Mergeable Dictionaries 



John Iacono Özgür Özkan

February 23, 2010 

Abstract 

A data structure is presented for the Mergeable Dictionary abstract data type, which supports
the following operations on a collection of disjoint sets of totally ordered data: Predecessor-Search,
Split and Merge. While Predecessor-Search and Split work in the normal way, the novel
operation is Merge. While in a typical mergeable dictionary (e.g. 2-4 Trees), the Merge operation
can only be performed on sets that span disjoint intervals in keyspace, the structure here has
no such limitation, and permits the merging of arbitrarily interleaved sets. Tarjan and Brown
present a data structure [4] which can handle arbitrary Merge operations in $O(\log n)$ amortized
time per operation if the set of operations is restricted to exclude the Split operation. In the
presence of Split operations, the amortized time complexity of their structure becomes $O(n)$.
A data structure which supports both Split and Merge operations in $O(\log^2 n)$ amortized time
per operation was given by Farach and Thorup [6]. In contrast, our data structure supports all
operations, including Split and Merge, in $O(\log n)$ amortized time, thus showing that interleaved
Merge operations can be supported at no additional cost vis-a-vis disjoint Merge operations.

1 Introduction



Consider the following operations on a data structure which maintains a dynamic collection $\mathcal{S}$ of
disjoint sets $\{S_1, S_2, \ldots\}$ which partition some totally ordered universal set $\mathcal{U}$:

• $S \leftarrow$ Find($x$): Returns the set $S \in \mathcal{S}$ that contains $x$.

• $p \leftarrow$ Search($S, x$): Returns the largest element in $S$ that is at most $x$.

• $(A, B) \leftarrow$ Split($S, x$): Splits $S$ into two sets $A = \{y \in S \mid y \le x\}$ and $B = \{y \in S \mid y > x\}$.
$S$ is removed from $\mathcal{S}$ while $A$ and $B$ are inserted.

• $C \leftarrow$ Merge($A, B$): Creates $C = A \cup B$. $C$ is inserted into $\mathcal{S}$ while $A$ and $B$ are removed.

We call a data structure that supports these operations a Mergeable Dictionary. In this paper
we present a data structure which implements these operations in amortized time $O(\log n)$, where $n$
is the total number of items in $\mathcal{U}$. What makes the concept of a Mergeable Dictionary interesting is
that the Merge operation does not require that the two sets being merged occupy disjoint intervals
in keyspace. As we discuss in full detail in Section 1.2, a data structure for merging arbitrarily
interleaved sets has appeared independently in the context of Union-Split-Find, Mergeable Trees
and string matching in Lempel-Ziv compressed text. In all three cases, an $o(\log^2 n)$ bound on
mergeable dictionary operations could not be achieved. We present a data structure that is able to
break through this bound through use of a novel weighting scheme applied to an extended version
of the Biased Skip List data structure [2]. Another alternative would be to extend and use Biased
Search Trees [3], but we believe extending Biased Skip Lists will be easier, at least in terms of
presentation.
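To fix the semantics of the four operations before any efficiency concerns enter, the following is a minimal reference sketch. It is not the data structure of this paper: every operation here may take linear time or worse. The class name `NaiveMergeableDictionary` and the convention of identifying a set by its minimum element are assumptions made only for illustration.

```python
from bisect import bisect_right


class NaiveMergeableDictionary:
    """Reference semantics only: operations may take linear time or worse."""

    def __init__(self, universe):
        # Start from singleton sets, one per element of the universe.
        self.sets = {x: [x] for x in sorted(universe)}  # set id -> sorted list
        self.owner = {x: x for x in universe}           # element -> set id

    def find(self, x):
        """Return the id of the set containing x."""
        return self.owner[x]

    def search(self, s, x):
        """Return the largest element of set s that is at most x, or None."""
        items = self.sets[s]
        k = bisect_right(items, x)
        return items[k - 1] if k else None

    def split(self, s, x):
        """Replace s by A = {y <= x} and B = {y > x}; return their ids."""
        items = self.sets.pop(s)
        k = bisect_right(items, x)
        return self._add(items[:k]), self._add(items[k:])

    def merge(self, a, b):
        """Replace a and b by their union, however interleaved they are."""
        merged = sorted(self.sets.pop(a) + self.sets.pop(b))
        return self._add(merged)

    def _add(self, items):
        # Assumed convention: a set is identified by its minimum element.
        if not items:
            return None
        sid = items[0]
        self.sets[sid] = items
        for y in items:
            self.owner[y] = sid
        return sid
```

For instance, starting from the singletons of {1, ..., 8}, merging {1, 3, 5} with {2, 4} interleaves the two sets, which a Join-based dictionary could not do directly.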

We first present a high-level description of the core data structure of the previous work, and show 
at a high level the method and motivation we use to improve the runtime. Given this description, 
we then can discuss in some detail the three aforementioned works. Finally, we present the full 
details of our result. 



1.1 High-Level Description 

The basic idea of the structure is simple. As a first attempt we show how to achieve $O(\log^2 n)$ time,
which we outline here and fully present in Section 2. Store each set using an existing dictionary that
supports Search, Split and Join¹ in $O(\log n)$ time (e.g. 2-4 trees). Thus, the only operation that
requires a non-wrapper implementation is Merge. One first idea would be to implement Merge in
linear time as in Merge-Sort, but this performs poorly, as one would expect. A more intelligent idea
is to use a sequence of searches to determine how to partition the two sets into sets of segments
that span maximal disjoint intervals. Then, use a sequence of Splits to split each set into the
segments and a sequence of Join operations to piece together the segments in sorted order. As the
number of segments between two sets being merged could be $\Theta(n)$, the worst-case runtime of such
an implementation is $O(n \log n)$, even worse than the $O(n)$ of a brute-force merge. However, it is
impossible to perform many Merges with a high number of segments, and an amortized analysis
bears this out; there are only $O(\log n)$ amortized segments per Merge. Thus, since each segment
can be processed in $O(\log n)$ time, the total amortized cost per Merge operation is $O(\log^2 n)$.

¹ Join merges two sets but requires that the sets span disjoint intervals in keyspace.

In [9], it was shown that there are sequences of operations that have $\Theta(\log n)$ amortized segments
per Merge. This, combined with the worst-case lower bound of $\Omega(\log n)$ for the dictionary
operations needed to process each segment, seemingly gives a strong argument for an $\Omega(\log^2 n)$ lower
bound, which was formally conjectured by Lai. It would appear that any effort to circumvent
this impediment would require abandoning storing each set in sorted order. We show this is not
necessary, as a weighting scheme allows us to finesse the balance between the cost of processing each
segment and the number of segments to be processed; we, in essence, prevent the worst case of
these two needed events from happening simultaneously. Our scheme, combined with an extended
version of Biased Skip Lists, allows us to speed up the processing of each segment to $O(1)$ when
there are many of them, yet gracefully degrades to the information-theoretically mandated $\Theta(\log n)$
worst-case time when there are only a constant number of segments. The details, however, are numerous.
We show in Section 3.3 how to augment Biased Skip Lists to support weighted finger
versions of operations that we need. Given this, in Section 4 a full description of our structure is
presented, and in Section 5 the runtime analysis is proved.



1.2 Relationship to existing work 

The underlying problem addressed here has come up independently three times in the past, in the 
context of Union-Split-Find, Mergeable Trees and string matching in Lempel-Ziv compressed text. 
All three of these results, which were initially done independently of each other, bump up against 
the same $O(\log^2 n)$ issue with merging, and all have some variant of the $O(\log^2 n)$ structure outlined
above at their core. While the intricacy of the latter two precludes us claiming here to reduce the 
squared logarithmic terms in their runtimes, we believe that we have overcome the fundamental 
obstacle towards this improvement. 



1.2.1 Searching in Lempel-Ziv 

In the paper of Farach and Thorup, an algorithm is presented for string matching in a Lempel-Ziv
compressed string [6]. They show how to search for a string of length $p$ in a compressed string of
length $n$ that was compressed by a factor of $f$ in time $O(p + n \log^2 f)$. The algorithm is complex,
but at its heart needs a data structure that can hold a set $S$ of at most $n$ integers in the range
$1 \ldots U$, and can perform an operation that shifts all values in some interval in $S$ by some specified
integer amount, so long as all values remain in the range $1 \ldots U$. They show how to do this in time
$O(\log U \log N)$ using a variant of the $O(\log^2 n)$ structure described above. This variant allows a
whole set to be shifted, and thus shifting an interval can be done with two splits and two merges
in addition to a shift. The potential function they use to bound the number of segments is the
same one as in our presentation of the $O(\log^2 n)$ structure. A simple extension of our structure
can speed up the needed shifts by a $\log n$ factor. However, since they do not completely use
the aforementioned shifting data structure as a black box (there is an "unwinding" of some work
performed there, among other subtleties), we leave the improvement of the runtime to $O(p + n \log f)$
as only a conjectured application of our result.



1.2.2 Mergeable Trees

In the paper of Georgiadis, Tarjan, Werneck [8] and the follow-up tech report which adds the
authors Kaplan and Shafrir, the problem of Mergeable Trees is studied. This paper is concerned
with maintaining a dynamic collection of heaps, subject to operations which constitute the dynamic
tree ADT: parent, root, nca, insert, link, cut, delete, and one additional operation, Merge, which
makes things interesting. In the Merge operation, two nodes are specified, and the two root-to-node
paths, which are each in sorted order due to the heap property, are merged. The idea of a
Mergeable Tree ADT originated in an algorithm to compute the structure of a 2-manifold in $\mathbb{R}^3$.
(The original paper has an algorithm which is shown in [1] to take time $\Omega(\sqrt{n})$ per operation.) This
ADT is a generalization of our Mergeable Dictionary: if the heaps are restricted to be paths,
both structures have identical functionality. They too obtain an $O(\log n)$ amortized bound on the
number of segments, using the same potential function as in our $O(\log^2 n)$ presentation, albeit using
link-cut trees [12] as the underlying $O(\log n)$ structure due to their need to have the operations be
performed on tree paths in a heap rather than a totally ordered set, thus obtaining an $O(\log^2 n)$
amortized bound on their operations. We conjecture that using the weighting scheme presented in
this paper will allow the development of a Mergeable Tree with $O(\log n)$ runtime for all operations.
This would require the development of a weighted variant of link-cut trees that support weighted
finger searches. In effect, link-cut trees extend (2, 4) trees to allow operations on a tree topology,
while biased skip lists allow sophisticated weighting operations. We would need a combination of
both in order to achieve $O(\log n)$ amortized time Mergeable Trees. While we conjecture such a
combination is possible, the details to be worked out are numerous.



1.2.3 Union-Split-Find 

In the MIT thesis of Lai, supervised by Demaine, the problem of Union-Split-Find is studied [9].
This is proposed as a variant of the classic Union-Find data structure of [13]. Our Mergeable
Dictionary ADT implements Find as Search, which is a stronger operation than Find (e.g. Find
can be implemented by returning the maximum of a set). They propose the $O(\log^2 n)$ structure
outlined above, and show that its amortized performance is $\Omega(\log^2 n)$. They conjecture (correctly)
that there is a potential function that gives $O(\log^2 n)$ runtime, but do not discover it, instead listing
several potential functions which do not work. With thanks to Mihai Patrascu, they show that
there is a lower bound of $\Omega(\log n)$ for this problem which follows from dynamic connectivity lower
bounds [11]. This lower bound indicates our use of the stronger Search instead of the weaker Find
comes at no additional asymptotic amortized cost. They conjecture (incorrectly) that this problem
has a lower bound of $\Omega(\log^2 n)$ per operation; our $O(\log n)$ result refutes this. Our results directly
provide an amortized optimal $O(\log n)$ time solution to their problem, while the best upper bound
they could prove was $O(n)$.



1.3 Future Work 

This paper opens up several avenues for future work. First, as previously stated we believe that 
our approach can remove a log n factor from the runtimes of the Lempel-Ziv searching algorithm 
and of Mergeable Trees. Both of these results will require integrating our result into each of these 
different and complicated results. But, since in both cases the same potential function is used 
as in the $O(\log^2 n)$ structure presentation below, we believe that our data structure provides the
fundamental breakthrough needed to improve these results. 

Secondly, for simplicity we do not consider the dynamic case: our data structure always stores
a collection of sets that partitions the same totally ordered set. Extending our result to allow
insertion and deletion will require adding additional complexity to our weighting scheme. Finally,
we note that the idea that arbitrarily interleaved dictionaries can be merged in $O(\log n)$ amortized
time is probably a surprising one, which at first glance probably appears to be impossible (which
is supported by Lai's inability to get an $o(n)$ solution). While previous results had elements of the
Mergeable Dictionary ADT, and an $O(\log^2 n)$ solution to it, here is the first clear abstraction of
it as a pure dictionary problem. The fact that the three results above were initially discovered
independently of each other also speaks to the "buried" nature of the fundamental problem in the
previous work. We hope that this result finds applications which were not considered because of
the seeming impossibility at first glance of an $o(n)$ solution.

2 A Simple Heuristic for Merge and the $O(\log^2 n)$ Amortized Bound

As mentioned previously, the main difficulty in designing a data structure for our problem with
$o(n)$ worst-case time complexity lies in being able to perform the Merge operation fast. This is
confirmed by a lower bound of $\Omega(n)$ on the worst-case time complexity of merging two arbitrary
sets [?]. We will describe a heuristic for the Merge operation presented in [5] and used in previous
work [7, 9], and show that the use of this heuristic yields $o(n)$ amortized bounds as a warm-up.

Consider the Merge($A, B$) operation and a maximal subset in either set $A$ or $B$ such that all
the elements of the other set are less than or greater than each element of the subset. We call this
maximal subset a segment. We can view the Merge($A, B$) operation as gluing the appropriate
segments of sets $A$ and $B$. Consider, for instance, the Merge algorithm of Merge-Sort, which could
be used to implement the Merge operation. The Merge algorithm linearly scans each segment
until it locates its maximum element. The segment merging heuristic is based on the idea that
there are more efficient methods of locating the maximum element of a set than a linear scan.

Next, we make the notion of segments slightly more precise, describe the heuristic, and describe 
the potential function which yields an upper bound of amortized $O(\log^2 n)$ time.

2.1 The Segment Merging Heuristic 

Define a segment of the Merge($A, B$) operation to be a maximal subset $S$ of either set $A$ or set
$B$ such that no element in $(A \cup B) \setminus S$ lies in the interval $[\min(S), \max(S)]$.

Each set in the collection is stored as a balanced search tree (e.g. a 2-4 tree) with level links.
The Find, Search, and Split operations are implemented² in a standard way to run in $O(\log n)$
worst-case time. The Merge($A, B$) operation is performed as follows:

We first locate the minimum and maximum element of each segment of the Merge($A, B$)
operation using the Search operation and the level links, then extract all these segments using
the Split operation, and finally we merge all the segments in the obvious way using a standard
Join operation. Therefore, since each operation takes $O(\log n)$ worst-case time, the total running
time is $O(T \cdot \log n)$ where $T$ is the number of segments.
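The segment decomposition described above can be sketched on plain sorted lists. This is only an illustration under stated assumptions: the function name `segments` is hypothetical, binary search stands in for Search with level links, and list slicing stands in for Split, so the sketch shows which segments arise but does not achieve the $O(T \cdot \log n)$ bound.

```python
from bisect import bisect_left


def segments(a, b):
    """Split sorted, disjoint lists a and b into the segments of Merge(a, b).

    Returns a list of (source, chunk) pairs in increasing key order, where
    source is "A" or "B" and chunk is a maximal run taken from one list.
    """
    out = []
    i = j = 0
    while i < len(a) or j < len(b):
        # Take from whichever list holds the smallest remaining key.
        if j == len(b) or (i < len(a) and a[i] < b[j]):
            # The run from a ends just before the first key exceeding b[j].
            end = len(a) if j == len(b) else bisect_left(a, b[j], i)
            out.append(("A", a[i:end]))
            i = end
        else:
            end = len(b) if i == len(a) else bisect_left(b, a[i], j)
            out.append(("B", b[j:end]))
            j = end
    return out
```

On `a = [1, 2, 3, 10, 11, 20]` and `b = [5, 6, 15, 30]` this yields six segments alternating between the two lists, so $T = 6$; concatenating the chunks in order gives the merged set.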

We now analyze all the operations using the potential method [14], with respect to two param- 
eters: n, the size of the universe, and m, the total number of all operations. 

Let $D_i$ represent the data structure after operation $i$, where $D_0$ is the initial data structure.
Operation $i$ has a cost of $c_i$ and transforms $D_{i-1}$ into $D_i$. We have a potential function $\Phi : \{D_i\} \to \mathbb{R}$
such that $\Phi(D_0) = 0$ and $\Phi(D_i) \ge 0$ for all $i$. The amortized cost of operation $i$, $\hat{c}_i$, with respect
to $\Phi$ is defined as $\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$. The total amortized cost of $m$ operations will be

$$\sum_{i=1}^{m} \hat{c}_i = \sum_{i=1}^{m} \bigl( c_i + \Phi(D_i) - \Phi(D_{i-1}) \bigr) = \sum_{i=1}^{m} c_i + \Phi(D_m) - \Phi(D_0) \ge \sum_{i=1}^{m} c_i$$

since $\Phi(D_m) \ge 0$ and $\Phi(D_0) = 0$. Thus, the amortized cost will give us an upper bound on the
worst-case cost.

² See [9] for a detailed description of this implementation.

Next, we describe a potential function which yields an amortized bound of $O(\log^2 n)$ on the
running time. This potential function was essentially used in [6, 7], which are the only instances
where an $o(n)$ solution has been presented.

2.2 The Potential Function 

We need to define some terminology before describing the potential function. Let $\mathrm{pos}_S(x)$ be the
position of $x$ in set $S$, or more formally $\mathrm{pos}_S(x) = |\{y \in S \mid y \le x\}|$. Then $g_S(k)$, the size of the $k$-th
gap of set $S$, is the difference of positions between the elements of position $k$ and $k + 1$ of set $S$ in
universe $\mathcal{U}$. In other words, $g_S(k) = \mathrm{pos}_{\mathcal{U}}(y) - \mathrm{pos}_{\mathcal{U}}(x)$ where $\mathrm{pos}_S(x) = k$ and $\mathrm{pos}_S(y) = k + 1$.
For the boundary cases, let $g_S(0) = g_S(|S|) = 1$.

Recall that $D_i$ is the data structure containing our dynamic collection of disjoint sets,
$\mathcal{S}^{(i)} = \{S_1^{(i)}, S_2^{(i)}, \ldots\}$, after the $i$-th operation. Finally, let $\varphi(S) = \sum_{j=1}^{|S|-1} \log g_S(j)$. Then we define the
potential after the $i$-th operation as follows:

$$\Phi(D_i) = \kappa_a \cdot \sum_{S \in \mathcal{S}^{(i)}} \varphi(S) \log n$$

where $\kappa_a$ is a positive constant to be determined later.

Note that since the collection of sets initially consists of the $n$ singleton sets, the data structure initially
has zero potential ($\Phi(D_0) = 0$). Furthermore, because any gap has size at least 1, the data structure
always has non-negative potential ($\Phi(D_i) \ge 0$, $\forall i \ge 0$).
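The definitions of $\mathrm{pos}$, the gaps, $\varphi$ and $\Phi$ can be evaluated directly on small inputs. The sketch below assumes base-2 logarithms and represents the universe and each set as sorted Python lists; the function names `phi` and `potential` and the default $\kappa_a = 1$ are illustrative choices, not values fixed by the analysis.

```python
from bisect import bisect_right
from math import log2


def phi(s, universe):
    """phi(S): sum of log g_S(j) over the |S| - 1 interior gaps of S."""
    pos = [bisect_right(universe, x) for x in s]  # pos_U of each element
    return sum(log2(q - p) for p, q in zip(pos, pos[1:]))


def potential(collection, universe, kappa_a=1.0):
    """Phi(D) = kappa_a * log n * (sum of phi(S) over all sets S)."""
    n = len(universe)
    return kappa_a * log2(n) * sum(phi(s, universe) for s in collection)
```

With the universe {1, ..., 16}, the collection of singletons has potential 0, a set of adjacent elements such as {1, 2, 3} contributes nothing, and the set {1, 5, 9} contributes $\varphi = \log 4 + \log 4 = 4$.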

2.3 The Amortized $O(\log^2 n)$ Bound

The Find, Search, and Split operations have worst-case $O(\log n)$ running times. The first two
of these operations do not change the structure and therefore do not affect the potential. Observe
that the Split operation can only decrease the potential. Thus, the amortized cost of all three
operations is $O(\log n)$.
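The observation about Split follows directly from the definition of $\varphi$. As a short check, assume Split($S, x$) produces non-empty sets $A$ and $B$ with $|A| = k$; the interior gaps of $A$ and $B$ are then exactly the interior gaps of $S$ other than the $k$-th one:

```latex
\varphi(A) + \varphi(B)
  = \sum_{j=1}^{k-1} \log g_S(j) + \sum_{j=k+1}^{|S|-1} \log g_S(j)
  = \varphi(S) - \log g_S(k)
  \le \varphi(S),
\qquad \text{since } g_S(k) \ge 1 .
```

If one of the two sets is empty, the collection of gaps is unchanged and so is the potential.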

Now, suppose the $i$-th operation is Merge($A, B$) where $A$ and $B$ are sets in $D_{i-1}$. Assume
w.l.o.g. that the minimum element in $A \cup B$ is an element of $A$. Let $I(A, B) = \{A_1, B_1, A_2, B_2, \ldots\}$
be the set of segments of operation Merge($A, B$), where $\max(A_i) < \min(A_j)$ and $\max(B_i) < \min(B_j)$
for $i < j$, and $\max(A_i) < \min(B_i) \le \max(B_i) < \min(A_{i+1})$ for all $i$. As previously
noted, the worst-case cost of the Merge operation is $O(|I(A, B)| \cdot \log n)$. Let $a_i$ be the size of the
gap between the maximum element of $A_i$ and the minimum element of $A_{i+1}$, or more formally let
$a_i = g_{A_i \cup A_{i+1}}(|A_i|)$. Define $b_i$ similarly. Now, let

$$a'_i = g_{A_i \cup B_i}(|A_i|), \qquad a''_i = g_{B_i \cup A_{i+1}}(|B_i|)$$

and

$$b'_i = g_{B_i \cup A_{i+1}}(|B_i|), \qquad b''_i = g_{A_{i+1} \cup B_{i+1}}(|A_{i+1}|).$$

Figure 1: Gaps $a_i$, $b_i$, $a'_i$, $a''_i$, $b'_i$, and $b''_i$ are defined with respect to the Merge($A, B$) operation.

Note that $a''_i = b'_i$ and $a'_i = b''_{i-1}$ (see Figure 1). During the analysis we will take into account
whether $|I(A, B)|$ is odd or even. Let $\sigma = |I(A, B)| \bmod 2$, and $R = \lfloor (|I(A, B)| - 2)/2 \rfloor$. We are
now ready to bound the amortized cost of the Merge operation. We have

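$$\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}),$$

$$\begin{aligned}
\Phi(D_{i-1}) ={}& \sum_{S \in \mathcal{S} \setminus \{A, B\}} \varphi(S)\, \kappa_a \log n \\
 &+ \Bigl( \sum_i \varphi(A_i) + \sum_i \varphi(B_i) \Bigr) \kappa_a \log n \\
 &+ \Bigl( \sum_{i=1}^{R} (\log a_i + \log b_i) + \sigma \log a_{R+1} \Bigr) \kappa_a \log n
\end{aligned}$$

and

$$\begin{aligned}
\Phi(D_i) ={}& \sum_{S \in \mathcal{S} \setminus \{A, B\}} \varphi(S)\, \kappa_a \log n \\
 &+ \Bigl( \sum_i \varphi(A_i) + \sum_i \varphi(B_i) + \log a'_1 \Bigr) \kappa_a \log n \\
 &+ \Bigl( \tfrac{1}{2} \sum_{i=1}^{R} (\log a''_i + \log b'_i + \log b''_i + \log a'_{i+1}) \Bigr) \kappa_a \log n \\
 &+ \tfrac{1}{2} \bigl( \log a''_{R+1} + \log b'_{R+1} \bigr)\, \sigma \kappa_a \log n .
\end{aligned}$$

This gives us

$$\begin{aligned}
\Phi(D_i) - \Phi(D_{i-1})
 ={}& \Bigl( \tfrac{1}{2} \sum_{i=1}^{R} \bigl( \log a''_i b'_i b''_i a'_{i+1} - 2 \log a_i b_i \bigr) + \log a'_1 \Bigr) \kappa_a \log n \\
 &+ \Bigl( \tfrac{1}{2} \bigl( \log a''_{R+1} + \log b'_{R+1} \bigr) - \log a_{R+1} \Bigr) \sigma \kappa_a \log n \\
 ={}& \Bigl( \tfrac{1}{2} \sum_{i=1}^{R} \bigl( \log a'_i a''_i b'_i b''_i - 2 \log a_i b_i \bigr) + \tfrac{1}{2} \bigl( \log a'_1 + \log a'_{R+1} \bigr) \Bigr) \kappa_a \log n \\
 &+ \Bigl( \tfrac{1}{2} \bigl( \log a''_{R+1} + \log b'_{R+1} \bigr) - \log a_{R+1} \Bigr) \sigma \kappa_a \log n .
\end{aligned}$$

We have $a''_i \le a_i \le n$ and similarly $b'_i \le b_i \le n$. Also note that since $a'_i + a''_i \le a_i$, we have
$\log a'_i + \log a''_i \le \log a'_i + \log(a_i - a'_i) \le \log(a_i/2) + \log(a_i/2)$. Similarly, $\log b'_i + \log b''_i \le \log(b_i/2) + \log(b_i/2)$.
Hence

$$\begin{aligned}
\Phi(D_i) - \Phi(D_{i-1})
 &\le \Bigl( \tfrac{1}{2} \sum_{i=1}^{R} \bigl( \log a'_i a''_i b'_i b''_i - 2 \log a_i b_i \bigr) \Bigr) \kappa_a \log n + O(\log^2 n) \\
 &= \Bigl( \tfrac{1}{2} \sum_{i=1}^{R} \log \frac{a'_i \cdot a''_i \cdot b'_i \cdot b''_i}{a_i \cdot a_i \cdot b_i \cdot b_i} \Bigr) \kappa_a \log n + O(\log^2 n) \\
 &\le \Bigl( \tfrac{1}{2} \sum_{i=1}^{R} \log \frac{a_i/2 \cdot a_i/2 \cdot b_i/2 \cdot b_i/2}{a_i \cdot a_i \cdot b_i \cdot b_i} \Bigr) \kappa_a \log n + O(\log^2 n) \\
 &= \tfrac{1}{2} \Bigl( \sum_{i=1}^{R} \log \tfrac{1}{16} \Bigr) \kappa_a \log n + O(\log^2 n) \\
 &= -\kappa_b \cdot |I(A, B)|\, \kappa_a \log n + O(\log^2 n).
\end{aligned}$$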
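The step that bounds the product of the primed gaps by halves of the original gaps is an instance of the fact that $x(a - x)$ is maximized at $x = a/2$. Spelled out for one index $i$, and assuming base-2 logarithms for the final numeric value:

```latex
a'_i \, a''_i \;\le\; a'_i \,(a_i - a'_i) \;\le\; \Bigl(\tfrac{a_i}{2}\Bigr)^{2},
\qquad
b'_i \, b''_i \;\le\; \Bigl(\tfrac{b_i}{2}\Bigr)^{2}
\quad\Longrightarrow\quad
\log \frac{a'_i \, a''_i \, b'_i \, b''_i}{a_i^2 \, b_i^2}
  \;\le\; \log \frac{(a_i/2)^2 \,(b_i/2)^2}{a_i^2 \, b_i^2}
  \;=\; \log \frac{1}{16} \;=\; -4 .
```

After the factor $\tfrac{1}{2}$, each of the $R$ summands therefore releases at least $2 \kappa_a \log n$ units of potential, and since $R$ differs from $|I(A, B)|/2$ by at most a constant, the total release is linear in the number of segments up to an additive $O(\log n)$ term.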



Recall that the worst-case cost of the Merge operation, $c_i$, is $O(|I(A, B)| \log n)$. Let $\kappa_c$ be a
constant such that $c_i \le \kappa_c \cdot |I(A, B)| \log n$. Then the above bound yields

$$\begin{aligned}
\hat{c}_i &= c_i + \Phi(D_i) - \Phi(D_{i-1}) \\
 &\le c_i - \kappa_b \cdot |I(A, B)|\, \kappa_a \log n + O(\log^2 n) \\
 &\le \kappa_c \cdot |I(A, B)| \log n - \kappa_b \cdot |I(A, B)|\, \kappa_a \log n + O(\log^2 n) \\
 &= O(\log^2 n) \qquad (\text{set } \kappa_a = \kappa_c / \kappa_b).
\end{aligned}$$

Thus, the amortized cost of the Merge operation is $O(\log^2 n)$. Combined with the arguments
before, this gives us the following theorem.

Theorem 1. The Mergeable Dictionary problem can be solved such that a sequence of $m$ Find,
Search, Merge, and Split operations can be executed in $O(m \log^2 n)$ worst-case time.

In order to obtain a data structure with an amortized running time of $o(\log^2 n)$ per operation, we
certainly need a new potential function. To see this, observe the case when we have two singleton
sets $A$ and $B$ with elements $x \in A$ and $y \in B$ such that $\log(\mathrm{pos}_{\mathcal{U}}(x) - \mathrm{pos}_{\mathcal{U}}(y)) = \Omega(\log n)$.
If we use the potential function defined in Section 2.2, the potential increase alone as a result
of a Merge($A, B$) operation is $\Omega(\log^2 n)$. We want to eliminate the extra $\log n$ factor in the
potential function, but this implies we need to be able to join segments in amortized $O(1)$ time.

We ultimately want a data structure such that each operation except Merge can be performed in
worst-case $O(\log n)$ time, and Merge can be performed in worst-case

$$O\Bigl( \log n + \sum_i F(A_i) + \sum_j F(B_j) \Bigr)$$

time where $A_i$ and $B_j$ are segments involved in the operation, $F(A_i)$ and $F(B_j)$ denote the time it
takes to process segments $A_i$ and $B_j$ respectively, and $\sum_i F(A_i) + \sum_j F(B_j)$ is no more than the
decrease in potential.
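The two-singleton case mentioned above can be made concrete. As an illustrative choice, assume the two elements lie $n/2$ positions apart in the universe, with $x$ the larger one. The merged set has a single interior gap, so under the potential function of Section 2.2:

```latex
\Phi(D_i) - \Phi(D_{i-1})
  = \kappa_a \log n \cdot \log\bigl(\mathrm{pos}_{\mathcal{U}}(x) - \mathrm{pos}_{\mathcal{U}}(y)\bigr)
  = \kappa_a \log n \cdot \log \frac{n}{2}
  = \Omega(\log^2 n),
```

even though this Merge involves only two segments and has actual cost $O(\log n)$. The $\log n$ multiplier inside $\Phi$ is what forces the squared term, which is why a different weighting is needed.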

In the next section, we describe biased skip lists, the underlying data structure we will be using 
in our data structure. 



3 Biased Skip Lists with Extended Operations 



Biased skip lists are a variant of skip lists [10], which we assume the reader is familiar with. Biased
skip lists as described in [2] are missing some operations which will be vital in the implementation
of our data structure. Therefore, in order to be able to design a highly tuned Merge operation,
we will extend biased skip lists.

First, we describe essential biased skip list details. The reader is referred to [2] for further
details on biased skip lists.

3.1 Biased Skip Lists 

We will first cover basic definitions followed by the three key invariants of the structure. 



Definitions A biased skip list (BSL) $S$ stores an ordered set $X$ where each element $x \in X$
corresponds to a node³ $x \in S$ with weight $w(x)$, which is user-defined, and integral height $h(x)$,
which is initially computed from the weight of the node. For our purposes, we will assume that the
weights are bounded from below by 1 and bounded from above by a polynomial in $n$.

Each node $x \in S$ is represented by an array of length $h(x) + 1$ called the tower of node $x$.
The level-$j$ predecessor, $L_j(x)$, of $x$ is the largest node $k$ in $S$ such that $k < x$ and $h(k) \ge j$.
The level-$j$ successor, $R_j(x)$, is defined symmetrically. The $j$-th element of the tower of node $x$
contains pointers to the $j$-th elements of the towers of node $L_j(x)$ and node $R_j(x)$, with the exception
of towers of adjacent nodes, where pointers between any pair of adjacent nodes $x$ and $y$ on level
$\min(h(x), h(y)) - 1$ are nil and the pointers below this level are undefined. Node levels progress
from top to bottom. Two distinct elements $x$ and $y$ are called consecutive if and only if they are linked
together in $S$; or equivalently if and only if for all $x < z < y$, $h(z) < \min(h(x), h(y))$. A plateau
is a maximal set of consecutive nodes of the same height. The rank of a node $x$ is defined as
$r(x) = \lfloor \log_a w(x) \rfloor$ where $a$ is a constant to be specified later. For our purposes, we will set $a = 2$.

Additionally, let $\mathrm{pred}_X(x)$ be the predecessor of $x$ in set $X$, and let $\mathrm{succ}_X(x)$ be the successor
of $x$ in set $X$. Let $H(X) = \max_{x \in X} h(x)$. Let $S[\leftarrow j] = \{x \in S \mid x < j\}$ and $S[j \rightarrow] = \{x \in S \mid x > j\}$.
Let $W(S) = \sum_{x \in S} w(x)$ and $W_{i,j}(S) = \sum_{x \in S,\, i \le x \le j} w(x)$.

For convenience, we imagine sentinel nodes $-\infty$ and $+\infty$ of height $H(S)$ at the beginning and
end of biased skip list $S$. These sentinels are not actually stored or maintained.

³ We will use the terms "element", "node", "key", and "item" interchangeably; the context clarifies any ambiguity.

The left profile of $x$ in a biased skip list $S$ is defined as $\{L_j(x) \mid h(L_j(x)) = j\}$. Similarly the
right profile of $x$ in a biased skip list $S$ is defined as $\{R_j(x) \mid h(R_j(x)) = j\}$. The profile of node $x$
in a biased skip list $S$ is the union of its left profile and right profile.

The left cover of a biased skip list $S$ is defined as $\{\min(S)\} \cup \{R_j(\min(S)) \mid h(R_j(\min(S))) = j,\ j > h(\min(S))\}$.
Similarly, the right cover of a biased skip list $S$ is defined as
$\{\max(S)\} \cup \{L_j(\max(S)) \mid h(L_j(\max(S))) = j,\ j > h(\max(S))\}$. The cover of biased skip list $S$ is the union of
its left cover and right cover.
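The level-$j$ predecessor and the left profile can be computed naively from the sequence of node heights, which may help in reading the definitions. In this sketch a skip list is modelled only as a Python list `heights` indexed by key order, sentinels are ignored, and the function names are hypothetical; the right profile is symmetric.

```python
def level_pred(heights, x, j):
    """Index of L_j(x): the nearest node left of x with height >= j, or None."""
    for k in range(x - 1, -1, -1):
        if heights[k] >= j:
            return k
    return None


def left_profile(heights, x):
    """Indices of { L_j(x) : h(L_j(x)) = j }, scanning levels j = 0, 1, ..."""
    profile = []
    for j in range(max(heights) + 1):
        k = level_pred(heights, x, j)
        # Keep L_j(x) only when its height is exactly j.
        if k is not None and heights[k] == j:
            profile.append(k)
    return profile
```

For `heights = [3, 1, 2, 0, 1, 2]` and the last node (index 5), the left profile is the "staircase" of indices 4, 2 and 0 seen when looking left from that node; index 3 is hidden behind the taller node at index 4.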

Invariants The three invariants of biased skip lists are listed below. Note that $a$ and $b$ can be
suitable constants satisfying the definition of $(a, b)$-biased skip lists. For our purposes it is sufficient
to set $a = 2$, $b = 6$.

Definition 2. For any $a$ and $b$ such that $1 < a \le \lfloor b/3 \rfloor$, an $(a, b)$-biased skip list is a biased skip list
with the following properties:

(I0) Each item $x$ has height $h(x) \ge r(x)$.

(I1) There are never more than $b$ consecutive items of any height.

(I2) For each node $x$ and for all $i$ such that $r(x) < i \le h(x)$, there are at least $a$ nodes of height
$i - 1$ between $x$ and any consecutive node of height at least $i$.

In the remainder of the paper, we will refer to (2, 6)-biased skip lists simply as biased skip lists. 
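Two of the three invariants are easy to check on a flat description of a skip list. The sketch below models a list as parallel Python lists of weights and heights in key order, with $a = 2$ and $b = 6$ as defaults; the function names are hypothetical, (I2) is deliberately left out, and sentinels are ignored.

```python
from math import floor, log2


def rank(weight, a=2):
    """r(x) = floor(log_a w(x)); weights are assumed to be >= 1."""
    return floor(log2(weight) / log2(a))


def check_i0(weights, heights, a=2):
    """(I0): every node is at least as tall as its rank."""
    return all(h >= rank(w, a) for w, h in zip(weights, heights))


def check_i1(heights, b=6):
    """(I1): no plateau (run of consecutive equal-height nodes) exceeds b."""
    run = {}  # height -> length of the plateau currently open at that height
    for h in heights:
        # A node of height h closes every open plateau of smaller height.
        for lower in [k for k in run if k < h]:
            del run[lower]
        run[h] = run.get(h, 0) + 1
        if run[h] > b:
            return False
    return True
```

Note that shorter nodes lying between two nodes of height $h$ do not break a plateau at height $h$, whereas a taller node does; this mirrors the definition of consecutive nodes.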

3.2 Operations

We now describe the original biased skip list operations we will be using in our data structure. 

• $p \leftarrow$ BSL-Search($S, i$): Performs a standard search in biased skip list $S$ using search key $i$.
This operation runs in worst-case $O(\log n)$ time.

• $p \leftarrow$ BSL-FSearch($i, j$): Starting from a given finger⁴ to a node $i$ in some biased skip list
$S$, performs a predecessor search in $S$ using $j$ as the search key. This operation runs in



worst case time. 

• $(A, B) \leftarrow$ BSL-Split($S, i$): Splits the biased skip list $S$ at $i$ into two biased skip lists $A$ and
$B$ storing sets $\{x \in S \mid x \le i\}$ and $\{x \in S \mid x > i\}$ respectively, and returns an ordered pair of
handles to $A$ and $B$. This operation runs in worst-case $O(\log n)$ time.

• BSL-Rew($S, i, w$): Changes the weight of node $i \in S$ to $w$. This operation runs in worst-case
$O(\log n)$ time.

⁴ This operation is originally described with three arguments in [2] as FingerSearch($X, i, j$). However, note that
$X$ is a redundant argument here as we already have a pointer to an element of $X$, namely, to $i$.

3.3 Extended Operations

Biased skip lists support finger searches; however, we need to extend biased skip lists to also support
a finger split, a finger join, and a finger reweight operation.

Finger Split Given a pointer to node $f \in S$, BSL-FSplit($f$) splits biased skip list $S$ into two
biased skip lists, $A$ and $B$, storing sets $\{x \in S \mid x \le f\}$ and $\{x \in S \mid x > f\}$ respectively, and returns
an ordered pair of handles to $A$ and $B$.

$(A, B) \leftarrow$ BSL-FSplit($f$):

1. Disconnect the pointers between each node in the right profile of node $f$ and the left profile
of node $f' = \mathrm{succ}_S(f)$, effectively splitting $S$ and forming $A$ and $B$. More precisely, disconnect
pointers between the $j$-th level of node $R_j(f)$ and $L_j(f')$ where $j \ge \min(h(f), h(f'))$. Pointers
below this level are already null.

2. Restore (I2) in $A$. We will process the nodes of the right cover of $A$ after Step 1 in the order
of increasing height. Denote the current node being processed by $u$. Let $h'$ be the height of
the node which was most recently processed.

(a) If $u = f$, then set $h' = h(u)$ and demote the height of $u$ to $r(u)$.

(b) If (I2) is not violated at $u$, then stop if $h(u) > \min(H(S[\leftarrow f]), H(S[f \rightarrow]))$; otherwise⁵ set
$h' = h(u)$ and iterate with the next node, $L_{h(u)+1}(A)$.

(c) If (I2) is violated at node $u$, then demote the height of $u$ to $\max(r(u), h')$. Set $h'$ to the
height of $u$ before the demotion.

• If a demotion causes an (I1) violation at the new height of $u$ by creating a plateau
of $b' > b$ nodes, then promote the height of the median of these nodes by 1, and
iterate at the level above to check for percolating (I1) violations.

• Once all the (I1) violations are fixed, iterate with the next node, $L_{h'+1}(A)$, to fix
the next potential (I2) violation.

Restore (I2) in $B$ essentially symmetrically.

Correctness All the pointers in S connecting any node in A and any node in B are precisely 
those described in Step 1. Therefore Step 1 splits the nodes of A and B correctly. 

We need to ensure that all the invariants are preserved. When we perform demotions in Step 2
we make sure that we do not demote the height of any node lower than its rank. Therefore (I0) is
preserved. Note that removing predecessors or successors cannot cause an (I1) violation.

Observe that in $A$, (I2) can only be violated at the nodes in the right cover of $A$ after Step 1,
once per level. A symmetric argument holds for (I2) violations in $B$. When we demote a node $u$
this fixes the (I2) violation at that node because the height of the node is either demoted to $r(u)$,
which by definition implies the violation is fixed, or to $h'$. If the height of node $u$ is demoted to
$h'$, this implies that there was another node $u'$ with height $h'$, consecutive to $u$, before the BSL-FSplit
operation. The only way we change the height of any node between node $u$ and node $u'$
during BSL-FSplit($f$) is an (I1) promotion, which cannot cause an (I2) violation. Note that
there cannot be an (I2) violation at these nodes since they are not in the right cover of $A$ after
Step 1. We assume there were no (I2) violations before the BSL-FSplit operation. Then it holds
that there can be no (I2) violations at node $u$ at any level less than or equal to $h'$. Therefore,
demoting $u$ to level $h'$ fixes any (I2) violations at this node.

⁵ Note that $L_{h(u)+1}(A)$ can be computed in $O(b) = O(1)$ time due to (I1); and $\min(H(S[\leftarrow f]), H(S[f \rightarrow]))$ can be
computed during Step 1.

Any of the demotions may cause an (I1) violation, which is also fixed by Step 2. Because we
promote the median, this cannot cause an (I2) violation since $\lfloor b'/2 \rfloor \ge \lfloor b'/3 \rfloor \ge a$. Also note that
the (I1) promotions caused by the demotion of a node never percolate higher than the height of
the node before the demotion. Thus, nodes on the right cover of $A$ after Step 1 that have not yet
been processed during Step 2 are not removed from the right cover due to an (I1) promotion.

Observe that if a node $u$ in the right cover of $A$ after Step 1 has height greater than
$\min(H(S[\leftarrow f]), H(S[f \rightarrow]))$ and there is no (I2) violation at this node, then no other node of greater height in $A$
can have an (I2) violation. Symmetric arguments again apply for $B$. Thus, by induction, iterating
Step 2 fixes all (I2) violations.

Finger Join Given pointers to $\ell$ and $r$, the maximum and minimum nodes of two distinct biased
skip lists $A$ and $B$ respectively, BSL-FJoin($\ell, r$) returns a new biased skip list $C$ containing all
elements of $A$ and $B$, assuming $\ell < r$. $A$ and $B$ are destroyed in the process.

$C \leftarrow$ BSL-FJoin($\ell, r$)

1. Connect pointers between each node in the right profile of node $\ell$ and the left profile of node
$r$, effectively joining $A$ and $B$, and forming $C$. More precisely, create pointers between the
$j$-th level of node $R_j(\ell)$ and $L_j(r)$ where $j \ge \min(h(\ell), h(r))$. Pointers below this level need to
already be null.

2. Restore (I1) in $C$. For $j = \max(h(\ell), h(r))$ up through $\min(H(A), H(B))$, if there is an (I1)
violation at level $j$ caused by a plateau of $b' > b$ nodes, then promote the height of the median
of these nodes by 1, and iterate at the next level. For $j = \min(H(A), H(B)) + 1$ up through
$\max(H(A), H(B))$, if there is no (I1) violation at level $j$, then stop⁶. Otherwise promote the
median node of the plateau of $b' > b$ nodes causing the violation to restore (I1) and iterate
at the next level.

Correctness  Joining $A$ and $B$ only affects the right profile of $\ell$ and the left profile of $r$. Therefore,
Step 1 connects the two biased skip lists $A$ and $B$ and forms $C$ correctly. Joining two biased skip
lists cannot create any (I2) violations assuming there were no such violations before the operation;
and fixing (I1) violations cannot create (I2) violations since $\lfloor b'/2 \rfloor \ge \lfloor b'/3 \rfloor \ge a$. Therefore, (I2)
is preserved. Note that after Step 1, any level in the range $[\max(h(\ell), h(r)), \max(H(A), H(B))]$
could have at most $2b + 1$ ($b$ from $A$, $b$ from $B$, and 1 due to a promotion from the level below)
consecutive nodes of the same height, which is fixed in Step 2. By induction, iterating Step 2 fixes
all (I1) violations.
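
As a rough illustration of the (I1) repair in Step 2, the following Python sketch models a skip list as a plain list of node heights and promotes the median of any plateau holding more than $b$ nodes. It is a simplification under stated assumptions, not the paper's procedure: the names `plateaus` and `restore_i1` are hypothetical, ranks, pointers and (I2) are ignored, and every level is scanned instead of stopping early as Step 2 does.

```python
def plateaus(heights, level):
    """Maximal runs (lists of indices) of nodes of height exactly `level`.

    Nodes shorter than `level` are invisible at this level and are skipped;
    a taller node ends the current run. This is an assumed, simplified
    reading of "plateau".
    """
    runs, current = [], []
    for i, h in enumerate(heights):
        if h < level:
            continue
        if h == level:
            current.append(i)
        else:
            if current:
                runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def restore_i1(heights, b):
    """Promote plateau medians until no plateau has more than b nodes."""
    heights = list(heights)
    level = min(heights)
    while level <= max(heights):
        violated = [run for run in plateaus(heights, level) if len(run) > b]
        if violated:
            run = violated[0]
            heights[run[len(run) // 2]] += 1  # promote the median by one level
        else:
            level += 1  # this level is clean; move up
    return heights


print(restore_i1([1] * 7, b=3))
```

With seven nodes of height 1 and $b = 3$, a single promotion of the middle node splits the plateau into two plateaus of three nodes each, mirroring the "$b$ from $A$, $b$ from $B$, and 1 from below" worst case described above.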

6 $\min(H(A), H(B))$ can be computed during Step 1 of BSL-FJoin.

Finger Reweight  Given a pointer to a node $f \in S$, BSL-FRew($f$, $w$) changes its weight to $w$ while preserving
invariants (I0), (I1), and (I2) of the biased skip list containing $f$.

BSL-FRew($f$, $w$)

1. Let $r'(f)$ be the new rank of $f$. If $r'(f) = r(f)$, then stop.

2. If $r'(f) > r(f)$ and $h(f) \ge r'(f)$, then stop.

3. If $r'(f) > r(f)$ and $h(f) < r'(f)$, then promote the height of $f$ to $r'(f)$. Restore (I2) as in
Step 2 of BSL-FSplit but start from the first node in the left profile of $f$ that has height
greater than $h(f)$; and symmetrically from the first node in the right profile of $f$ that has
height greater than $h(f)$. Then, restore (I1) as in Step 2 of BSL-FJoin but start from level
$r'(f)$.

4. If $r'(f) < r(f)$, then demote the height of $f$ to $r'(f)$. Restore (I1) as in Step 2 of BSL-FJoin
but starting from $r'(f)$. Then, restore (I2) as in Step 2 of BSL-FSplit, but starting from
the first node in the left profile of $f$ that has height greater than $h(f)$; and symmetrically
from the first node in the right profile of $f$ that has height greater than $h(f)$.
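
The case analysis in steps 1-4 can be summarized by a small Python sketch. The function name `frew_case` and its return convention (step number, resulting height of $f$) are illustrative assumptions; the invariant repairs that follow a promotion or demotion are not modeled.

```python
def frew_case(old_rank, new_rank, height):
    """Return (step, new_height) for the four cases of BSL-FRew.

    old_rank = r(f), new_rank = r'(f), height = h(f) before the operation.
    Only the height change of f itself is modeled here.
    """
    if new_rank == old_rank:
        return 1, height          # rank unchanged: nothing to do
    if new_rank > old_rank and height >= new_rank:
        return 2, height          # rank grew but the tower is already tall enough
    if new_rank > old_rank:
        return 3, new_rank        # promote f to its new rank
    return 4, new_rank            # rank shrank: demote f to its new rank


for args in [(3, 3, 5), (3, 4, 5), (3, 6, 5), (4, 2, 5)]:
    print(args, "->", frew_case(*args))
```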

Correctness  If the rank of node $f$ does not change, there are no structural changes to the biased
skip list and therefore Step 1 is correct.

If the rank of $f$ increases but is still no greater than its height, then (I0) is preserved. By not
changing the height of the node we ensure that (I1) and (I2) are preserved as well. Therefore, Step
2 is also correct.

If the rank of a node $f$ becomes greater than its height, then (I0) is violated and we promote $f$
to its new rank to fix the violation. Observe that this promotion can cause (I2) to be violated at the
nodes in the left profile of $f$ that have height greater than the old height of $f$; and symmetrically at
the nodes in the right profile of $f$ that have height greater than the old height of $f$, once per level.
Step 3, by the correctness of Step 2 of BSL-FSplit, fixes all (I2) violations. The promotion can
also cause (I1) to be violated at level $r'(f)$. Step 3, by the correctness of Step 2 of BSL-FJoin,
fixes all (I1) violations. Therefore, Step 3 is correct.

If the rank of a node $f$ decreases, we demote $f$ to its new rank so (I2) cannot be violated at
this node. Observe that this demotion can cause (I1) to be violated at level $r'(f)$. Step 4, by the
correctness of Step 2 of BSL-FJoin, fixes all (I1) violations. The demotion can also cause (I2) to
be violated at the nodes in the left profile of $f$ that have height greater than the old height of $f$;
and symmetrically at the nodes in the right profile of $f$ that have height greater than the old height
of $f$, once per level. Step 4, by the correctness of Step 2 of BSL-FSplit, fixes all (I2) violations.
Therefore, Step 4 is correct.

Since steps 1-4 exhaust all possible scenarios, all the invariants are preserved and the BSL-FRew
operation is correct.

Before moving on, we need to analyze the time complexity of these new operations. 

3.4 The Analysis of Extended Operations 

We now analyze the time complexity of the extended operations described above using the potential
method.

The extended operations, BSL-FSplit, BSL-FJoin, and BSL-FRew, are executed on a set
of biased skip lists. Let $L_k$ be the set of biased skip lists after the $k$-th operation, where $L_0$ is the
initial set of biased skip lists we are given.

Let $V_k$ be the set of plateaus which contain an element in the cover of some biased skip list in
$L_k$. For a plateau $p$, let $|p|$ be the number of nodes contained in plateau $p$. We define the following
sets of plateaus. Let $S^k_{[x,y]} = \{p \in V_k \mid x \le |p| \le y\}$ and $S^k_{(x,y)} = \{p \in V_k \mid x < |p| < y\}$, with $S^k_{[x]}$ denoting $S^k_{[x,x]}$. We can
now define a potential function as follows.

$$\Phi_e(L_k) = \kappa_g \left( 8|S^k_{(0,a)}| + 4|S^k_{[a]}| + |S^k_{[b]}| + 3|S^k_{(b,2b)}| + 5|S^k_{[2b,2b+1]}| \right).$$

Observe that $\Phi_e(L_k) \ge 0$ for any $k$. For each extended operation, we will use this potential function
to prove upper bounds on the amortized time complexity of the operation as well as the worst-case
time complexity of a sequence of operations.
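
To make the bookkeeping concrete, here is a small Python sketch that evaluates a potential of this form for a list of plateau sizes. The per-size coefficients (8, 4, 1, 3, 5) are read off the case analysis in the proof of Lemma 3; the function names, the default constant `kappa = 1`, and the example parameters $a = 2$, $b = 6$ are assumptions made for illustration only.

```python
def plateau_weight(size, a, b):
    """Coefficient charged to a plateau of the given size (assumes a <= b // 3)."""
    if 0 < size < a:
        return 8
    if size == a:
        return 4
    if size == b:
        return 1
    if b < size < 2 * b:
        return 3
    if 2 * b <= size <= 2 * b + 1:
        return 5
    return 0  # sizes strictly between a and b carry no potential


def potential(plateau_sizes, a, b, kappa=1):
    """Potential of a collection of cover plateaus, given only their sizes."""
    return kappa * sum(plateau_weight(s, a, b) for s in plateau_sizes)


a, b = 2, 6
print(potential([1, 2, 4, 6, 7, 12, 13], a, b))
# A plateau growing from fewer than a nodes into the (a, b) range releases 8 units:
print(potential([4], a, b) - potential([1], a, b))
```

Plateaus whose size lies strictly between $a$ and $b$ are "comfortable" and cost nothing; the further a plateau sits from that range, the more potential it stores to pay for later repairs.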

Lemma 3. The $(A, B) \leftarrow$ BSL-FSplit($f$) operation, where $f \in S$, has an amortized time complexity of

$$O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))) + 1)$$

where $A' = S[\leftarrow f]$ and $B' = S[f \rightarrow]$.

Proof. Let BSL-FSplit($f$) be the $k$-th operation. Step 1 takes $O(\min(H(A'), H(B')) - \min(h(\max(A')), h(\min(B'))) + 1)$
time. Step 2 takes constant time at each level from level $\min(r(\max(A')), r(\min(B')))$
to level $\min(H(A'), H(B'))$. We will show that the time spent by Step 2 on levels greater than
$\min(H(A'), H(B'))$ is essentially negligible by showing that it equals the decrease in potential at
these levels.

We now analyze the contribution of Step 1 and Step 2 to the potential change, $\Phi_e(L_k) - \Phi_e(L_{k-1})$.

Step 1  Due to the split, the plateaus on the cover of $S$ become plateaus on the cover of $A$ and
$B$. Additionally, the plateaus on levels less than or equal to $\min(H(A'), H(B'))$ on the right cover
of $A$ and left cover of $B$ are added to $V_k$. Therefore, the contribution of Step 1 to the potential
change is at most $O(\min(H(A'), H(B')) - \min(h(\max(A')), h(\min(B'))) + 1)$.

Step 2  We now look at the contribution of Step 2 to the potential change, $\Phi_e(L_k) - \Phi_e(L_{k-1})$.
Note that at any level less than or equal to $\min(H(A'), H(B'))$, the potential increase can be at
most a constant. Therefore, the maximum contribution of Step 2 to the potential change in these
levels is $O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))))$. Next, we bound the contribution
of Step 2 to the potential change in levels greater than $\min(H(A'), H(B'))$.

Demotions  Consider a demotion operation to restore (I2) at some node $x$. This demotion could
cause a change in potential in two ways.

First, the plateau $p'$ which was causing the (I2) violation could have had fewer than $a$ nodes, and
now has more.

Note that since we demote node $x$ to the level of $p'$, there must be a plateau of at least $a$ nodes
(due to (I2)) on the other side of node $x$. Therefore, $p'$ will have at least $a + 1$ nodes. Also, the
plateau on the other side of node $x$ cannot have more than $b$ nodes. Therefore, $p'$ will have at most
$b + a$ nodes. This implies that $p'$ can only have between $a + 1$ and $b + a$ nodes. If $p'$ has between
$a + 1$ and $b - 1$ nodes, the contribution of this part to the potential change is $-8\kappa_g$. If $p'$ has $b$
nodes, the contribution of this part to the potential change is $-8\kappa_g + \kappa_g = -7\kappa_g$. If $p'$ has between
$b + 1$ and $b + a$ nodes, the contribution of this part to the potential change is $-8\kappa_g + 3\kappa_g = -5\kappa_g$.
Therefore, the maximum contribution of this part to the potential change is $-5\kappa_g$.

Note that $p'$ might not exist (have zero nodes). In this case, the maximum contribution of this
part to the potential change is $3\kappa_g$. However, this case is only possible if $p'$ has height less than or
equal to $\min(H(A'), H(B'))$.

Second, the demotion of $x$ could cause the plateau $p''$, which $x$ was a part of before the demotion,
to have fewer nodes. If $p''$ had $2b$ nodes or more and now has between $b + 1$ and $2b - 1$ nodes, the
contribution of this part to the potential change is $-5\kappa_g + 3\kappa_g = -2\kappa_g$. If $p''$ had between $b + 1$
and $2b - 1$ nodes and now has $b$ nodes, the contribution of this part to the potential change is
$-3\kappa_g + \kappa_g = -2\kappa_g$. If $p''$ had $b$ nodes and now has between $a + 1$ and $b - 1$ nodes, the contribution
of this part to the potential change is $-\kappa_g$. If $p''$ had between $a + 1$ and $b - 1$ nodes and now has $a$
nodes, the contribution of this part to the potential change is $4\kappa_g$. If $p''$ had $a$ nodes and now has
between 1 and $a - 1$ nodes, the contribution of this part to the potential change is $-4\kappa_g + 8\kappa_g = 4\kappa_g$.
Additionally, if $p''$ had between 1 and $a - 1$ nodes and now has zero nodes, the contribution of this
part to the potential change is $-8\kappa_g$.

Therefore, combining the first and second part, the maximum contribution of a demotion to
the potential change is $7\kappa_g$ on levels less than or equal to $\min(H(A'), H(B'))$, and $-\kappa_g$ on levels
greater.

Note that the demotion of $x$ could cause a large number of plateaus which did not have any
elements in the cover of their biased skip list to now have an element in the cover and thus enter
$V_k$. In order for this case to occur, the height of $x$ must be demoted at least two levels. By the
description of Step 2, this implies there were no nodes of height $h(x) - 1$ in the biased skip list
containing $x$ after Step 1. Since $S$ has no (I2) violations before Step 1, this implies there must be
nodes of height $h(x) - 1$ in the other biased skip list. Therefore, this case is only possible in levels
less than or equal to $\min(H(A'), H(B'))$.

Promotions  Consider a promotion operation at a node $x$ to restore an (I1) violation on the
plateau $p'$ node $x$ is on. This promotion could cause a change in potential in two ways.

First, the plateau $p'$ could have had between $b + 1$ and $2b + 1$ nodes, and now has fewer. If $p'$ had
between $b + 1$ and $2b - 1$ nodes, then the promotion of node $x$ splits $p'$ into two plateaus. Only one
of these plateaus remains in the cover of the biased skip list unless nodes of $p'$ were at the highest
level of the biased skip list. The plateau remaining in the cover must have size greater than $a$ and
less than $b$. Therefore, $p'$ will now have between $a + 1$ and $b - 1$ nodes, and the contribution of this
part to the potential change is $-3\kappa_g$. In the special case of both plateaus remaining in the cover,
both of them will have between $a + 1$ and $b - 1$ nodes, and the contribution of this part to
the potential change is $-3\kappa_g$. If $p'$ has $2b$ or more nodes, then the promotion of $x$ splits $p'$ into
two plateaus. Only one of these plateaus remains in the cover of the biased skip list unless nodes of
$p'$ were at the highest level of the biased skip list. The plateau remaining in the cover must have
size either $b - 1$ or $b$. If it has $b - 1$ nodes, the contribution of this part to the potential change is
$-5\kappa_g$. If it has $b$ nodes, the contribution of this part to the potential change is $-5\kappa_g + \kappa_g = -4\kappa_g$.
In the special case of both plateaus remaining in the cover, either one of them has $b - 1$
nodes and the other one has $b$ nodes, and the contribution of this part to the potential change is
$-5\kappa_g + \kappa_g = -4\kappa_g$; or both of them have $b$ nodes, and the contribution of this part to the potential
change is $-5\kappa_g + 2\kappa_g = -3\kappa_g$.

Second, the promotion of $x$ could cause the plateau $p''$, which $x$ becomes a part of after the
promotion, to have more nodes. If $p''$ had more than $2b - 1$ nodes, note that it must have had
at most $2b$ nodes. Promotion of $x$ increases the size of $p''$ to $2b + 1$. Thus, it does not change
its set and the contribution of this part to the potential change is 0. Note that, in general, if the
promotion does not change the set of $p''$, then the contribution of this part to the potential change
is 0. If $p''$ had $2b - 1$ nodes and now has $2b$ nodes, the contribution of this part to the potential
change is $-3\kappa_g + 5\kappa_g = 2\kappa_g$. If $p''$ had $b$ nodes and now has $b + 1$ nodes, the contribution of this
part to the potential change is $-\kappa_g + 3\kappa_g = 2\kappa_g$. If $p''$ had $b - 1$ nodes and now has $b$ nodes, the
contribution of this part to the potential change is $\kappa_g$. If $p''$ had $a$ nodes and now has $a + 1$ nodes,
the contribution of this part to the potential change is $-4\kappa_g$. If $p''$ had $a - 1$ nodes and now has $a$
nodes, the contribution of this part to the potential change is $-8\kappa_g + 4\kappa_g = -4\kappa_g$. Additionally, if
$p''$ had zero nodes and now has 1 node, the contribution of this part to the potential change is $8\kappa_g$.
However, this can only happen if an earlier demotion caused the only node of $p''$ to be demoted.
Therefore, $p''$ first has 1 node; then a demotion causes $p''$ to have 0 nodes; and then a promotion
associated with that demotion causes $p''$ to have 1 node again. The effect on the potential change
is zero.

Therefore, combining the first and second part, the maximum contribution of a promotion to
the potential change is $-\kappa_g$.

Assume the worst-case running time of any single demotion or promotion operation is bounded
by a constant $\kappa_h$. Let $\hat{c}_k$ and $c_k$ respectively be the amortized and worst-case running time of
BSL-FSplit($f$) and let $v_k$ be the number of violations that are restored during Step 2 above level
$\min(H(A'), H(B'))$. Then we have $c_k \le O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))) + 1) + \kappa_h v_k$.
Combining the bounds on the maximum contribution of Step 1 and Step 2 to the potential
change and setting $\kappa_g = \kappa_h$, we have $\Phi_e(L_k) - \Phi_e(L_{k-1}) \le -v_k \kappa_h + O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))))$. This yields

$$\begin{aligned}
\hat{c}_k &= c_k + \Phi_e(L_k) - \Phi_e(L_{k-1}) \\
&\le \kappa_h v_k + \Phi_e(L_k) - \Phi_e(L_{k-1}) + O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))) + 1) \\
&\le \kappa_h v_k - \kappa_h v_k + O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))) + 1) \\
&= O(\min(H(A'), H(B')) - \min(r(\max(A')), r(\min(B'))) + 1).
\end{aligned}$$

This completes the proof. □

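Lemma 4. Given a set $L_0 = \{A_1, A_2, \ldots, A_m\}$ of biased skip lists, a sequence of $(S_k, T_k) \leftarrow$
BSL-FSplit($f_k$) operations for $1 \le k \le t$, where $f_k \in U_k$ and $U_k \in L_{k-1}$, can be executed in
worst-case time

$$O\left(\sum_{k=1}^{t} \left(\min(H(S'_k), H(T'_k)) - \min(r(\max(S'_k)), r(\min(T'_k))) + 1\right)\right) + O\left(\sum_{i=1}^{m} \left(H(A_i) - \min(h(\min(A_i)), h(\max(A_i))) + 1\right)\right)$$

where $S'_k = U_k[\leftarrow f_k]$ and $T'_k = U_k[f_k \rightarrow]$.

Proof. Let $\hat{c}_k$ and $c_k$ respectively be the amortized and worst-case running time of BSL-FSplit($f_k$).
Note that $\Phi_e(L_0) = O(|V_0|) = O\left(\sum_{i=1}^{m} (H(A_i) - \min(h(\min(A_i)), h(\max(A_i))) + 1)\right)$. Then we
have

$$\begin{aligned}
\sum_{k=1}^{t} \hat{c}_k &= \sum_{k=1}^{t} \left(c_k + \Phi_e(L_k) - \Phi_e(L_{k-1})\right) = \sum_{k=1}^{t} c_k + \Phi_e(L_t) - \Phi_e(L_0), \\
\sum_{k=1}^{t} c_k &= \sum_{k=1}^{t} \hat{c}_k + \Phi_e(L_0) - \Phi_e(L_t) \\
&\le \sum_{k=1}^{t} \hat{c}_k + \Phi_e(L_0) \\
&= O\left(\sum_{k=1}^{t} \left(\min(H(S'_k), H(T'_k)) - \min(r(\max(S'_k)), r(\min(T'_k))) + 1\right)\right) + \Phi_e(L_0) \quad \text{(by Lemma 3)} \\
&= O\left(\sum_{k=1}^{t} \left(\min(H(S'_k), H(T'_k)) - \min(r(\max(S'_k)), r(\min(T'_k))) + 1\right)\right) + O\left(\sum_{i=1}^{m} \left(H(A_i) - \min(h(\min(A_i)), h(\max(A_i))) + 1\right)\right).
\end{aligned}$$

□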



Lemma 5. The $S \leftarrow$ BSL-FJoin($\ell$, $r$) operation, where $\ell \in A$, $r \in B$, has an amortized time
complexity of

$$O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1).$$

Proof. Let BSL-FJoin($\ell$, $r$) be the $k$-th operation. Step 1 takes $O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1)$
time. Step 2 takes constant time at each level from level $\min(h(\max(A)), h(\min(B)))$
to level $\min(H(A), H(B))$. We will show that the time spent by Step 2 on levels greater than
$\min(H(A), H(B))$ is essentially negligible by showing that it equals the decrease in potential at
these levels.

We now analyze the contribution of Step 1 and Step 2 to the potential change, $\Phi_e(L_k) - \Phi_e(L_{k-1})$.

Step 1  Due to the join, the plateaus on levels greater than $\min(H(A), H(B))$ as well as the
plateaus on the left cover of $A$ and the right cover of $B$ become plateaus on the cover of $S$. The plateaus
on levels less than or equal to $\min(H(A), H(B))$ on the right cover of $A$ and on the left cover of
$B$ are not on the cover of $S$ and thus are not added to $V_k$.

The only way Step 1 can increase the potential is if all the plateaus on the covers of $A$ and $B$
become plateaus on the cover of $S$ and the plateaus containing nodes $\max(A)$ and $\min(B)$ merge
and form a larger plateau with more weight with respect to the potential function. Therefore, the
contribution of Step 1 to the potential change is at most $3\kappa_g$.

Step 2  Note that Step 2 only involves promotions. Any promotion on a level less than or equal to
$\min(H(A), H(B))$ will increase the potential by at most a constant, and the total contribution to the
potential change is bounded by $O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1)$. We will
focus on promotions which occur on higher levels.

Consider a promotion operation at a node $x$ to restore an (I1) violation on the plateau $p'$ node
$x$ is on, where $p'$ is on a level greater than $\min(H(A), H(B))$. The promotion could cause a change
in potential in two ways.

First, the plateau $p'$ could have had $b + 1$ nodes; then the promotion of node $x$ splits $p'$ into
two plateaus with between $a + 1$ and $b - 1$ nodes each. Then, the contribution of this part to the
potential change is $-3\kappa_g$.

Second, the promotion of $x$ could cause the plateau $p''$, which $x$ becomes a part of after the
promotion, to have more nodes. Note that due to (I1), $p''$ could not have had more than $b$ nodes.
If $p''$ had $b$ nodes and now has $b + 1$ nodes, the contribution of this part to the potential change
is $-\kappa_g + 3\kappa_g = 2\kappa_g$. If $p''$ had fewer than $b$ nodes and now has at most $b$ nodes, the potential could
increase by at most a constant, but the promotion will not cause an (I1) violation at $p''$, and Step
2 will terminate.

Therefore, combining the first and second part, the maximum contribution of a promotion to
the potential change is $-\kappa_g$ except possibly for the last promotion.

Assume the worst-case running time of any single promotion operation is bounded by a constant
$\kappa_j$. Let $\hat{c}_k$ and $c_k$ respectively be the amortized and worst-case running time of BSL-FJoin($\ell$, $r$)
and let $v_k$ be the number of violations that are restored during Step 2 above level $\min(H(A), H(B))$.
Then we have $c_k \le O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1) + \kappa_j v_k$. Combining the
bounds on the maximum contribution of Step 1 and Step 2 to the potential change and setting $\kappa_g = \kappa_j$,
we have $\Phi_e(L_k) - \Phi_e(L_{k-1}) \le -v_k \kappa_j + O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1)$.
This yields

$$\begin{aligned}
\hat{c}_k &= c_k + \Phi_e(L_k) - \Phi_e(L_{k-1}) \\
&\le \kappa_j v_k + \Phi_e(L_k) - \Phi_e(L_{k-1}) + O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1) \\
&\le \kappa_j v_k - \kappa_j v_k + O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1) \\
&= O(\min(H(A), H(B)) - \min(h(\max(A)), h(\min(B))) + 1).
\end{aligned}$$

This completes the proof. □

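Lemma 6. Given a set $L_0 = \{A_1, A_2, \ldots, A_m\}$ of biased skip lists, a sequence of $U_k \leftarrow$ BSL-FJoin($\ell_k$, $r_k$)
operations for $1 \le k \le t$, where $\ell_k \in S_k$, $r_k \in T_k$ and $S_k, T_k \in L_{k-1}$, can be executed in worst-case
time

$$O\left(\sum_{k=1}^{t} \left(\min(H(S_k), H(T_k)) - \min(h(\max(S_k)), h(\min(T_k))) + 1\right)\right) + O\left(\sum_{i=1}^{m} \left(H(A_i) - \min(h(\min(A_i)), h(\max(A_i))) + 1\right)\right).$$

Proof. Let $\hat{c}_k$ and $c_k$ respectively be the amortized and worst-case running time of BSL-FJoin($\ell_k$, $r_k$).
Note that $\Phi_e(L_0) = O(|V_0|) = O\left(\sum_{i=1}^{m} (H(A_i) - \min(h(\min(A_i)), h(\max(A_i))) + 1)\right)$. Then we
have

$$\begin{aligned}
\sum_{k=1}^{t} \hat{c}_k &= \sum_{k=1}^{t} \left(c_k + \Phi_e(L_k) - \Phi_e(L_{k-1})\right) = \sum_{k=1}^{t} c_k + \Phi_e(L_t) - \Phi_e(L_0), \\
\sum_{k=1}^{t} c_k &= \sum_{k=1}^{t} \hat{c}_k + \Phi_e(L_0) - \Phi_e(L_t) \\
&\le \sum_{k=1}^{t} \hat{c}_k + \Phi_e(L_0) \\
&= O\left(\sum_{k=1}^{t} \left(\min(H(S_k), H(T_k)) - \min(h(\max(S_k)), h(\min(T_k))) + 1\right)\right) + \Phi_e(L_0) \quad \text{(by Lemma 5)} \\
&= O\left(\sum_{k=1}^{t} \left(\min(H(S_k), H(T_k)) - \min(h(\max(S_k)), h(\min(T_k))) + 1\right)\right) + O\left(\sum_{i=1}^{m} \left(H(A_i) - \min(h(\min(A_i)), h(\max(A_i))) + 1\right)\right).
\end{aligned}$$

□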

Lemma 7. The BSL-FRew($f$, $w$) operation, where $f \in S$, has a worst-case and amortized time
complexity of

$$O(\max(H(S), r'(f)) - \min(h(f), r'(f)) + 1).$$

Proof. BSL-FRew($f$, $w$) only spends constant time at each level between $\min(h(f), r'(f))$ and
$\max(h(f), r'(f))$ for promoting or demoting $f$; and at most constant time at each level between
$\min(h(f), r'(f))$ and $\max(H(S), r'(f))$ for restoring invariants. Therefore, the worst-case complexity
of BSL-FRew($f$, $w$) is $O(\max(H(S), r'(f)) - \min(h(f), r'(f)) + 1)$.

Let BSL-FRew($f$, $w$) be the $k$-th operation. Note that no plateaus that are on levels less than
$\min(h(f), r'(f))$ are affected by the operation. Since the potential increase associated with each level
is bounded by a constant, we have $\Phi_e(L_k) - \Phi_e(L_{k-1}) = O(\max(H(S), r'(f)) - \min(h(f), r'(f)))$.
Thus, the lemma follows. □

Lemma 8. The BSL-FSearch, BSL-FSplit, BSL-FJoin, and BSL-FRew operations all have
a worst-case time complexity of $O(\log n)$.

Proof. Let $W$ be the sum of the weights of all elements in the sets involved in any of these four
operations. By our previous assumptions, $W$ is bounded from above by a polynomial in $n$. Since
these versions of the operations are at least as efficient as the non-finger versions, which take $O(\log W)$
time in the worst case, these operations have a worst-case time complexity of $O(\log n)$. □

We now present our data structure, the Mergeable Dictionary, which improves the worst-case
bound in Theorem 1 by a factor of $\log n$, matching the lower bound of [11].

4 Our Data Structure: The Mergeable Dictionary



The Mergeable Dictionary stores each set in the collection $\mathcal{S}$ as a biased skip list. The weight of
each node in each biased skip list is determined by $\mathcal{S}$. When the collection of sets is modified,
for instance via a Merge operation, in order to reflect this change in the data structure, besides
splitting and joining biased skip lists, we need to ensure the weights of the affected nodes are
properly updated and biased skip list invariants (I0), (I1), and (I2) are preserved. For simplicity
we assume that $D_0$ is the collection of singleton sets and $D_i$, for all $i$, partitions the universe $\mathcal{U}$.
This lets us precompute, for each node $x$, $\mathrm{pos}_{\mathcal{U}}(x)$, the global position of $x$.

For the Merge algorithm, we will use the same basic approach outlined in Section 2, the
segment merging heuristic, which works by extracting the segments from each set and then gluing
them together to form the union of the two sets.

As previously mentioned, we do not have control over the number of segments that need
to be processed, which has been shown to have an amortized lower bound of $\Omega(\log n)$ per operation;
instead, we need to process each segment faster. In order to do so, we depart from balanced search trees
and instead use Biased Skip Lists [2] with the extended operations we introduced in Section 3.

Before we discuss the implementation of each operation in detail, we need to describe the
weighting scheme.

4.1 Weighting Scheme 

Let the weight of a node $x$, $w(x)$, be the sum of the sizes of its adjacent gaps. In other words, if
$\mathrm{pos}_S(x) = k$ for some node $x \in S$, then we have

$$w(x) = g_S(k - 1) + g_S(k).$$

Recall that $g_S(0) = g_S(|S|) = 1$. Observe that this implies for any set $S$, $W(S) \le 2n$.
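
The weighting scheme can be sketched in a few lines of Python. The sketch assumes that the gap between two consecutive elements of $S$ is the difference of their global positions, and that the two boundary gaps are 1 as recalled above; the function name `node_weights` is hypothetical.

```python
def node_weights(positions):
    """Weights w(x) = g_S(k-1) + g_S(k) for a set given by sorted global positions.

    Assumption: g_S(k) is the difference in global position between the k-th
    and (k+1)-th element of S, and g_S(0) = g_S(|S|) = 1.
    """
    gaps = [1] + [q - p for p, q in zip(positions, positions[1:])] + [1]
    return [gaps[k] + gaps[k + 1] for k in range(len(positions))]


S = [2, 5, 6, 10]          # global positions in a universe of size n = 10
w = node_weights(S)
print(w, sum(w))           # total weight stays within 2n
```

Every interior gap is counted twice (once by each endpoint) and the interior gaps sum to at most $n - 1$, which is where the bound $W(S) \le 2n$ comes from.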

4.2 The Find, Search, and Split Operations 

The Find($x$) operation can simply return the maximum element of the set which contains $x$ by
invoking BSL-FSearch($x$, $+\infty$). The Search($X$, $t$) operation can be performed by simply invoking
BSL-Search($X$, $t$). The Split($X$, $t$) operation can be performed by simply invoking
BSL-Split($X$, $t$) and running BSL-FRew on one node in each of the resulting biased skip lists to restore
the weights.

4.3 The Merge Operation 

We will use the Mergeable Dictionary in Figure 2 to illustrate the Merge operation. The Merge($A$, $B$)
operation can be viewed as having four essential phases: finding the segments, extracting the
segments, updating the weights, and gluing the segments. A more detailed description follows.

Phase I: Finding the segments  Assume $\min(A) < \min(B)$ w.l.o.g. Let $z = \lceil |I(A, B)|/2 \rceil$ and
$v = \lfloor |I(A, B)|/2 \rfloor$. Recall that $I(A, B) = \{A_1, B_1, A_2, B_2, \ldots\}$ is the set of segments associated
with the Merge($A$, $B$) operation, where $A_i$ and $B_i$ are the $i$-th segments of $A$ and $B$ respectively.
We have $\min(A_1) = \min(A)$ and $\min(B_1) = \min(B)$. Given $\min(A_i)$ and $\min(B_i)$, we find $\max(A_i)$
by invoking BSL-FSearch($\min(A_i)$, $\min(B_i)$). Similarly, given $\min(B_i)$ and $\min(A_{i+1})$ we find
$\max(B_i)$ by invoking BSL-FSearch($\min(B_i)$, $\min(A_{i+1})$). Lastly, given $\max(A_i)$ and $\max(B_i)$,
observe that $\min(A_{i+1}) = \mathrm{succ}_A(\max(A_i))$ and $\min(B_{i+1}) = \mathrm{succ}_B(\max(B_i))$. Note that the succ()
operation is performed in constant time in a biased skip list using the lowest successor link of a
node. At the end of this phase, all the segments are found (see Figure 3). Specifically, we have
computed, for all $i$ and $j$, $(\min(A_i), \max(A_i))$ and $(\min(B_j), \max(B_j))$.
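
The alternating search pattern of Phase I can be illustrated on sorted Python lists. This is only a sketch: `find_segments` is a hypothetical name, the sets are plain sorted lists of distinct keys, and `bisect.bisect_left` with a lower bound stands in for the finger search BSL-FSearch (it is a binary search from the current position, not a true finger search).

```python
import bisect


def find_segments(A, B):
    """Return the segments of Merge(A, B) as (set_name, min, max) in key order.

    Assumes A and B are sorted, disjoint, non-empty, and min(A) < min(B).
    """
    assert A and B and A[0] < B[0]
    lists = {"A": A, "B": B}
    start = {"A": 0, "B": 0}      # index of the minimum of the next segment
    segments = []
    cur, oth = "A", "B"
    while start[cur] < len(lists[cur]):
        lo = start[cur]
        if start[oth] < len(lists[oth]):
            bound = lists[oth][start[oth]]                   # min of the other set's next segment
            hi = bisect.bisect_left(lists[cur], bound, lo)   # one past max of this segment
        else:
            hi = len(lists[cur])                             # other set exhausted: take the rest
        segments.append((cur, lists[cur][lo], lists[cur][hi - 1]))
        start[cur] = hi                                      # successor of max = next segment's min
        cur, oth = oth, cur
    return segments


for seg in find_segments([1, 2, 7, 8, 20], [3, 4, 9, 30]):
    print(seg)
```

In this example $|I(A, B)| = 6$, so $z = v = 3$: three segments come from each set and they alternate in key order.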

Phase II: Extracting the segments  Since we know where the minimum and maximum node
of each segment is from the previous phase, we can extract all the segments easily in order by
invoking BSL-FSplit($\max(A_i)$) for $1 \le i < z$, and BSL-FSplit($\max(B_j)$) for $1 \le j < v$ (see
Figure 4).



Figure 2: We have two biased skip lists $A$ (top) and $B$ (bottom). The tower of each node is
represented by a set of vertically stacked squares which represent individual levels of a node. The
predecessor/successor pointers are also shown. The lightly shaded levels exist due to promotions.
The position of each node $x$, $\mathrm{pos}_{\mathcal{U}}(x)$, is indicated at the bottom of the tower of each node.




Figure 3: At the end of Phase I, we have all the minimum and maximum nodes of the segments
of the Merge($A$, $B$) operation. The minimum nodes are colored blue, the maximum nodes are
colored red.

Phase III: Updating Weights  Next, we need to update the weights of the affected nodes. Let
the new weight of item $x$ be $w'(x)$. Then

For $2 \le i \le z$, let

$$w'(\min(A_i)) = w(\min(A_i)) + \mathrm{pos}_{\mathcal{U}}(\max(A_{i-1})) - \mathrm{pos}_{\mathcal{U}}(\max(B_{i-1})).$$

For $2 \le i \le v$, let

$$w'(\min(B_i)) = w(\min(B_i)) + \mathrm{pos}_{\mathcal{U}}(\max(B_{i-1})) - \mathrm{pos}_{\mathcal{U}}(\max(A_i)).$$

For $1 \le i < z$, let

$$w'(\max(A_i)) = w(\max(A_i)) + \mathrm{pos}_{\mathcal{U}}(\min(B_i)) - \mathrm{pos}_{\mathcal{U}}(\min(A_{i+1})).$$

For $1 \le i < v$, let

$$w'(\max(B_i)) = w(\max(B_i)) + \mathrm{pos}_{\mathcal{U}}(\min(A_{i+1})) - \mathrm{pos}_{\mathcal{U}}(\min(B_{i+1})).$$

We also have

$$w'(\min(B_1)) = w(\min(B_1)) - 1 + \mathrm{pos}_{\mathcal{U}}(\min(B_1)) - \mathrm{pos}_{\mathcal{U}}(\max(A_1)).$$

If $|I(A, B)|$ is even, we have

$$w'(\max(A_z)) = w(\max(A_z)) - 1 + \mathrm{pos}_{\mathcal{U}}(\min(B_z)) - \mathrm{pos}_{\mathcal{U}}(\max(A_z)).$$

If $|I(A, B)|$ is odd, we have

$$w'(\max(B_v)) = w(\max(B_v)) - 1 + \mathrm{pos}_{\mathcal{U}}(\min(A_z)) - \mathrm{pos}_{\mathcal{U}}(\max(B_v)).$$

We can perform these weight updates (see Figure 5) by invoking

• BSL-FRew($\min(A_i)$, $w'(\min(A_i))$) for $2 \le i \le z$,

• BSL-FRew($\max(A_i)$, $w'(\max(A_i))$) for $1 \le i \le v$,

• BSL-FRew($\min(B_j)$, $w'(\min(B_j))$) for $1 \le j \le v$,

• BSL-FRew($\max(B_j)$, $w'(\max(B_j))$) for $1 \le j < z$.
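
A quick way to see why only segment endpoints need reweighting is to recompute all weights from scratch before and after a merge and compare. The Python sketch below does this under stated assumptions: keys double as global positions, gaps are differences of consecutive positions with boundary gaps of 1, and the names `node_weights` and `changed_nodes` are hypothetical. It is a brute-force check, not the incremental update used by the data structure.

```python
def node_weights(positions):
    """Map each element to w(x) = left gap + right gap (boundary gaps are 1)."""
    gaps = [1] + [q - p for p, q in zip(positions, positions[1:])] + [1]
    return {x: gaps[k] + gaps[k + 1] for k, x in enumerate(positions)}


def changed_nodes(A, B):
    """Elements whose weight differs between {A, B} and the merged set."""
    before = {**node_weights(A), **node_weights(B)}
    after = node_weights(sorted(A + B))
    return sorted(x for x in after if after[x] != before[x])


A = [1, 2, 3, 10, 11, 12]      # segments A_1 = {1,2,3}, A_2 = {10,11,12}
B = [5, 6, 7, 20, 21, 22]      # segments B_1 = {5,6,7}, B_2 = {20,21,22}
print(changed_nodes(A, B))     # only minima and maxima of segments appear
print(node_weights(sorted(A + B))[3])
```

For instance, $\max(A_1) = 3$ had weight 8 in $A$; the update rule gives $8 + 5 - 10 = 3$, which matches the recomputed value.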

Phase IV: Gluing the segments  Since we assumed w.l.o.g. that $\min(A) < \min(B)$, the correct
order of the segments is $(A_1, B_1, A_2, B_2, \ldots)$ by construction. We can glue all the segments by
invoking BSL-FJoin($\max(A_i)$, $\min(B_i)$) for $1 \le i \le v$ and BSL-FJoin($\max(B_i)$, $\min(A_{i+1})$) for
$1 \le i < z$ (see Figure 6).

This concludes the presentation of our data structure. In the next section we will analyze the
running time of each of the four operations.

Figure 4: At the end of Phase II, all the segments are extracted. These extractions cause invariants
(I1) and (I2) to be violated, which are then restored by BSL-FSplit. Hollow levels denote
demotions.

Figure 5: The weights of the minimum and maximum nodes of each segment are affected by the
Merge($A$, $B$) operation. Accordingly, in Phase III, we update the weights of these nodes. This
update causes (I0) violations. The BSL-FRew operation first restores (I0). Restoring (I0) causes
(I1) and (I2) violations which are then again restored by BSL-FRew. The hollow levels denote
demotions and green levels denote promotions.

Figure 6: In Phase IV, we join all the segments to obtain the union of $A$ and $B$. These joins cause
invariants (I1) and (I2) to be violated, which are then restored by BSL-FJoin. The green levels
denote promotions.

5 Analysis of the Mergeable Dictionary

Before we can analyze the amortized time complexity of the Mergeable Dictionary operations,
we need a new potential function which we will present in Section 5.1. We then prove that all
the operations except Merge have a worst-case and amortized time complexity of $O(\log n)$ in
Section 5.2. Lastly, we will show that the Merge operation has an amortized time complexity of
$O(\log n)$ in Section 5.3.


5.1 The New Potential Function 

Let $D_i$ be the data structure containing our dynamic collection of disjoint sets, $\mathcal{S}^{(i)} = \{S^{(i)}_1, S^{(i)}_2, \ldots\}$,
after the $i$-th operation. Let

$$\varphi(S) = \sum_{x \in S} \left(\log g_S(\mathrm{pos}_S(x) - 1) + \log g_S(\mathrm{pos}_S(x))\right).$$

Then we define the potential after the $i$-th operation as follows.

$$\Phi(D_i) = \kappa_d \cdot \sum_{j} \varphi(S^{(i)}_j),$$

where $\kappa_d$ is a constant to be determined later.

Note that the main difference between this function and the one in Section 2.2 is the elimination
of the $\log n$ term.
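
For intuition, the per-set term of this potential can be evaluated directly from the gaps. The Python sketch below assumes base-2 logarithms, gaps equal to differences of consecutive global positions, and boundary gaps of 1; `phi` is a hypothetical name, and the constant $\kappa_d$ is left out.

```python
import math


def phi(positions):
    """Sum over x in S of log g_S(pos_S(x) - 1) + log g_S(pos_S(x)).

    Assumptions: log base 2; gaps are differences of consecutive global
    positions; the two boundary gaps equal 1 and so contribute log 1 = 0.
    """
    gaps = [1] + [q - p for p, q in zip(positions, positions[1:])] + [1]
    return sum(math.log2(gaps[k]) + math.log2(gaps[k + 1])
               for k in range(len(positions)))


print(phi([1, 3, 7]))      # gaps 1, 2, 4, 1
print(phi([1, 2, 3, 4]))   # all gaps equal 1, so the potential is zero
```

Sets with wide gaps between neighbors hold more potential; a merge that interleaves two sets shrinks gaps and thereby releases potential to pay for the extra segments.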



5.2 The Analysis of the Find, Search, and Split Operations 

We now show that all the operations except Merge have a worst-case time complexity of $O(\log n)$,
and they do not cause a substantial increase in the potential, which yields that their amortized time
complexity is also $O(\log n)$.

Theorem 9. The worst-case and amortized time complexity of the Find($x$) operation, the Search($S$, $x$)
operation, and the Split($S$, $x$) operation is $O(\log n)$.

Proof. The worst-case time complexity of the BSL-FSearch operation invoked by the Find and
Search operations is $O(\log n)$ by Lemma 8. Recall that since these operations do not change the
structure, the potential remains the same. Therefore, the worst-case and amortized time complexity
of Find and Search is $O(\log n)$. The worst-case time complexity of the BSL-Split and BSL-Rew
operations invoked by the Split operation is $O(\log n)$ by Lemma 8. Observe that Split can
only decrease the potential. Therefore, the worst-case and amortized time complexity of Split is
$O(\log n)$. □

5.3 The Analysis of the Merge Operation 

All we have left to do is show that the amortized time complexity of the Merge operation is
$O(\log n)$. In order to do this, first we will show that the worst-case time complexity of the
Merge($A$, $B$) operation is $O\left(\log n + \sum_i F(A_i) + \sum_j F(B_j)\right)$. We define $F(A_i)$ and $F(B_j)$ next.

Definition 10. Consider the Merge($A$, $B$) operation. Recall that $w'(x)$ is the new weight of node
$x$ after the Merge($A$, $B$) operation. Also recall that $z = \lceil |I(A, B)|/2 \rceil$ and $v = \lfloor |I(A, B)|/2 \rfloor$.
Then, for $1 < i < z$, let

$$F(A_i) = \log \frac{w(\max(A_{i-1})) + w(\min(A_{i+1})) + \sum_{x \in A_i} w(x)}{\min\left(w'(\max(B_{i-1})), w'(\min(A_i)), w'(\max(A_i)), w'(\min(B_i))\right)}$$

and for $1 < j < v$, let

$$F(B_j) = \log \frac{w(\max(B_{j-1})) + w(\min(B_{j+1})) + \sum_{x \in B_j} w(x)}{\min\left(w'(\max(A_j)), w'(\min(B_j)), w'(\max(B_j)), w'(\min(A_{j+1}))\right)}.$$

For the boundary cases, let $F(A_1) = F(B_1) = F(A_z) = F(B_v) = \log n$.

5.3.1 The Worst-Case Time Complexity

We need to bound the worst-case time complexity of each phase of the Merge($A$, $B$) operation.

Lemma 11. Phase I of the Merge operation has a worst-case time complexity of

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

Proof. During Phase I, BSL-FSearch($\min(A_i)$, $t$) is invoked for $1 < i < z$, where $\mathrm{pred}_A(t) = \max(A_i)$ and $\mathrm{succ}_A(t) = \min(A_{i+1})$; and BSL-FSearch($\min(B_j)$, $s$) is invoked for $1 < j < v$, where $\mathrm{pred}_B(s) = \max(B_j)$ and $\mathrm{succ}_B(s) = \min(B_{j+1})$. Observe that $w'(\min(B_i)) \le w(\min(A_{i+1}))$ and $w'(\min(B_i)) \le w(\max(A_i))$. Therefore, by the definition of $F(A_i)$ and $F(B_j)$, and Lemma [?], the worst-case time complexity of Phase I is $O\bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\bigr)$. □

We will need the following lemma to bound the worst-case time complexity of Phases II-IV.

Lemma 12. Given a biased skip list $S$ and any node $f \in S$, recall that $S[\leftarrow f] = \{x \in S \mid x \le f\}$. Then,

\[ H(S[\leftarrow f]) \le \log W(S[\leftarrow f]). \]

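Proof. Let $R = \max_{x \in S[\leftarrow f]} r(x)$ and $N_r(t) = |\{x \in S[\leftarrow f] \mid r(x) = t\}|$. Also, let $N_h(t) = |\{x \in S[\leftarrow f] \mid h(x) \ge t,\ r(x) \le t\}|$. Then we have

\[ W(S[\leftarrow f]) \ge \sum_{i=2}^{R} 2^i N_r(i) + \sum_{\substack{x \in S[\leftarrow f],\\ r(x) = 1}} w(x) \ge \sum_{i=2}^{R} 2^i N_r(i) + 2 N_h(1). \]

Due to (I2), we have

\[ N_h(i) \ge 2\bigl(N_h(i+1) - N_r(i+1)\bigr), \]

and applying this inequality repeatedly gives

\[ N_h(1) \ge 2^{R-1} N_h(R) - \sum_{i=2}^{R} 2^{i-1} N_r(i) \ge 2^{H(S[\leftarrow f]) - 1} N_h(H(S[\leftarrow f])) - \sum_{i=2}^{R} 2^{i-1} N_r(i), \]

where the last step uses $N_r(t) = 0$ for $t > R$. This yields

\[ W(S[\leftarrow f]) \ge \sum_{i=2}^{R} 2^i N_r(i) + 2 N_h(1) \ge \sum_{i=2}^{R} 2^i N_r(i) + 2 \Bigl( 2^{H(S[\leftarrow f]) - 1} - \sum_{i=2}^{R} 2^{i-1} N_r(i) \Bigr) \ge 2^{H(S[\leftarrow f])}, \]

and therefore

\[ \log W(S[\leftarrow f]) \ge H(S[\leftarrow f]). \]

□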

Lemma 13. Phase II of the Merge operation has a worst-case time complexity of

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

Proof. During Phase II, BSL-FSplit($\max(A_i)$) is invoked for $1 < i < z$, and BSL-FSplit($\max(B_j)$) is invoked for $1 < j < v$. Combining Lemma 4 and Lemma 12 yields that the worst-case time complexity of Phase II is

\[ O\Bigl( \sum_{i=1}^{z-1} \bigl( \log W(A_i) - \min\bigl(r(\min(A_i)),\, r(\max(A_i)),\, r(\min(A_{i+1}))\bigr) + 1 \bigr) \Bigr) + O\Bigl( \sum_{j=1}^{v-1} \bigl( \log W(B_j) - \min\bigl(r(\min(B_j)),\, r(\max(B_j)),\, r(\min(B_{j+1}))\bigr) + 1 \bigr) \Bigr). \]

Observe that $w'(\min(B_i)) \le w(A_{i+1})$ and $w'(\min(A_{j+1})) \le w(B_{j+1})$. Then, by Lemma 8 and the definitions of $F(A_i)$ and $F(B_j)$, the worst-case time complexity of Phase II is

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

□

Lemma 14. Phase III of the Merge operation has a worst-case time complexity of

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

Proof. During Phase III, we invoke

• BSL-FRew($\min(A_i)$, $w'(\min(A_i))$) for $2 < i < z$,

• BSL-FRew($\max(A_i)$, $w'(\max(A_i))$) for $1 < i < v$,

• BSL-FRew($\min(B_j)$, $w'(\min(B_j))$) for $1 < j < v$,

• BSL-FRew($\max(B_j)$, $w'(\max(B_j))$) for $1 < j < z$.

Let $t_1 = \min(B)$. If $|I(A,B)|$ is even, let $t_2 = \max(A)$, otherwise let $t_2 = \max(B)$. Observe that $r'(x) \le r(x)$ if and only if $x \notin \{t_1, t_2\}$. Then, by Lemma 7 and Lemma 12, BSL-FRew($x$, $w'(x)$) has a worst-case time complexity of

• $O(\log W(A_i) - r'(\min(A_i)) + 1)$ for all $x \in \{\min(A_i) \mid 1 < i < z\}$,

• $O(\log W(A_i) - r'(\max(A_i)) + 1)$ for all $x \in \{\max(A_i) \mid 1 < i < z\} \setminus \{t_2\}$,

• $O(\log W(B_j) - r'(\min(B_j)) + 1)$ for all $x \in \{\min(B_j) \mid 1 < j < v\} \setminus \{t_1\}$,

• $O(\log W(B_j) - r'(\max(B_j)) + 1)$ for all $x \in \{\max(B_j) \mid 1 < j < v\} \setminus \{t_2\}$,

and a worst-case time complexity of $O(\log n)$ for $x \in \{t_1, t_2\}$.

Therefore, by Lemma 8 and the definitions of $F(A_i)$ and $F(B_j)$, the worst-case time complexity of Phase III is $O\bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\bigr)$.

□

Lemma 15. Phase IV of the Merge operation has a worst-case time complexity of

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

Proof. During Phase IV, we invoke BSL-FJoin($\max(A_i)$, $\min(B_i)$) for $1 < i < v$ and we invoke BSL-FJoin($\max(B_i)$, $\min(A_{i+1})$) for $1 < i < z$. Combining Lemma 6 and Lemma 12 yields that the worst-case time complexity of Phase IV is

\[ O\Bigl( \sum_{i} \bigl( \log W(A_i) - \min\bigl(r(\max(B_{i-1})),\, r(\min(A_i)),\, r(\max(A_i))\bigr) + 1 \bigr) \Bigr) + O\Bigl( \sum_{j} \bigl( \log W(B_j) - \min\bigl(r(\max(A_j)),\, r(\min(B_j)),\, r(\max(B_j))\bigr) + 1 \bigr) \Bigr), \]

which by Lemma 8 and the definitions of $F(A_i)$ and $F(B_j)$ yields

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

□

The next theorem bounds the worst-case time complexity of the Merge($A$, $B$) operation.

Theorem 16. The Merge($A$, $B$) operation has a worst-case time complexity of

\[ O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr). \]

Proof. The worst-case time complexity of the Merge($A$, $B$) operation is determined by the time it spends on each of the four phases. Therefore, by Lemmas 11, 13, 14, and 15, the theorem follows. □

5.3.2 Amortized Time Complexity 

Before we can show that the amortized time complexity of the Merge($A$, $B$) operation is $O(\log n)$, we will need to prove three lemmas. Let us first define the potential loss associated with a gap. Recall the definitions of gaps $a_i, a_i', a_i''$ and similarly $b_j, b_j', b_j''$ first defined in Section 2.3.



Definition 17. We define $\mathrm{pl}(a_i)$ and $\mathrm{pl}(b_j)$, the potential loss associated with gap $a_i$ and the potential loss associated with gap $b_j$ respectively, for $1 < i < z$ and $1 < j < v$, as follows:

\[ \mathrm{pl}(a_i) = 2 \log a_i - \log a_i' - \log a_i'' \qquad (1) \]
\[ \mathrm{pl}(b_j) = 2 \log b_j - \log b_j' - \log b_j'' \qquad (2) \]

Assume w.l.o.g. that $\min(A) < \min(B)$. Then we also let $\mathrm{pl}(a_0) = 0$ and

\[ \mathrm{pl}(b_0) = -\log a_1'. \qquad (3) \]

If $\max(A) > \max(B)$, then we have $\mathrm{pl}(a_z) = 0$ and

\[ \mathrm{pl}(b_v) = -\log a''_{z-1}. \qquad (4) \]

Otherwise, if $\max(A) < \max(B)$, we have $\mathrm{pl}(b_v) = 0$ and

\[ \mathrm{pl}(a_z) = -\log b''_{v-1}. \qquad (5) \]

Note that the potential loss associated with operation Merge($A$, $B$) is $\kappa_d$ times the sum of all defined $\mathrm{pl}(a_i)$ and $\mathrm{pl}(b_j)$, where $\kappa_d$ is the constant in the potential function.
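The potential loss of a single gap is simple to evaluate numerically. The short Python sketch below computes 2 log(gap) - log(left part) - log(right part) for one gap and the two sub-gaps it is split into; the function name `potential_loss`, its parameter names, and the choice of base-2 logarithms are assumptions made for this illustration.

```python
import math


def potential_loss(gap, left_part, right_part):
    """Return 2*log2(gap) - log2(left_part) - log2(right_part).

    gap is the size of a gap before the merge; left_part and right_part
    are the sizes of the two sub-gaps it is replaced by after the merge.
    """
    return 2 * math.log2(gap) - math.log2(left_part) - math.log2(right_part)


if __name__ == "__main__":
    # A gap of size 16 that is split into sub-gaps of size 2 and 8.
    print(potential_loss(16, 2, 8))
```

A gap that is split very unevenly, or that shrinks a lot, has a large potential loss, which is what pays for the work done near that gap.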
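Lemma 18. Consider gap $a_i$ for any $1 < i < z$. Let $a_i^+ = \max(a_i', a_i'')$ and $a_i^- = \min(a_i', a_i'')$. Then we have

\[ 2^{-\mathrm{pl}(a_i)} \le \frac{a_i}{a_i^+} \le \frac{a_i}{a_i^-} \le 2^{\mathrm{pl}(a_i)} \quad \text{and} \quad 2^{-\mathrm{pl}(a_i)} \le \frac{a_i^+}{a_i^-} \le 2^{\mathrm{pl}(a_i)}. \]

Similarly, consider gap $b_j$ for any $1 < j < v$, where $b_j^+ = \max(b_j', b_j'')$ and $b_j^- = \min(b_j', b_j'')$. Then we have

\[ 2^{-\mathrm{pl}(b_j)} \le \frac{b_j}{b_j^+} \le \frac{b_j}{b_j^-} \le 2^{\mathrm{pl}(b_j)} \quad \text{and} \quad 2^{-\mathrm{pl}(b_j)} \le \frac{b_j^+}{b_j^-} \le 2^{\mathrm{pl}(b_j)}. \]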

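Proof. We will prove the first part dealing with gap $a_i$. The proof of the second part is symmetric. Assume w.l.o.g. that $a_i' \le a_i''$. We have

\[ \log a_i'' + \log a_i' - 2 \log a_i = -\mathrm{pl}(a_i), \]
\[ \log \frac{a_i}{a_i'} + \log \frac{a_i}{a_i''} = \mathrm{pl}(a_i), \]

which gives us

\[ \log \frac{a_i}{a_i'} \le \mathrm{pl}(a_i) \quad \text{and} \quad \log \frac{a_i}{a_i''} \le \mathrm{pl}(a_i), \]

and consequently

\[ \frac{a_i}{a_i'} \le 2^{\mathrm{pl}(a_i)} \quad \text{and} \quad \frac{a_i}{a_i''} \le 2^{\mathrm{pl}(a_i)}. \]

Because $a_i' \le a_i'' \le a_i$, we have

\[ 1 \le \frac{a_i}{a_i''} \le \frac{a_i}{a_i'} \le 2^{\mathrm{pl}(a_i)}. \qquad (6) \]

Also, note that

\[ \frac{1}{2^{\mathrm{pl}(a_i)}} \le \frac{a_i'}{a_i} \le 1 \le \frac{a_i}{a_i'}, \]

and similarly,

\[ \frac{1}{2^{\mathrm{pl}(a_i)}} \le \frac{a_i''}{a_i} \le 1 \le \frac{a_i}{a_i''}. \qquad (7) \]

Lastly, observe that

\[ \frac{a_i''}{a_i'} \le \frac{a_i}{a_i'} \le 2^{\mathrm{pl}(a_i)} \quad \text{by (6)}, \]

and

\[ \frac{1}{2^{\mathrm{pl}(a_i)}} \le \frac{a_i'}{a_i} \le \frac{a_i'}{a_i''}. \]

This concludes the proof. The second part of the lemma can be proven using symmetric arguments.

□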
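The ratio bounds derived from the potential loss can be sanity-checked numerically. The following Python sketch draws random gaps and verifies that the ratios between a gap and its two sub-gaps, and between the sub-gaps themselves, all lie between 2^(-pl) and 2^(pl). It assumes that both sub-gaps are no larger than the original gap (as in the proof above) and that logarithms are base 2; the helper names `pl`, `ratios_within_bounds`, and `random_check` are hypothetical.

```python
import math
import random


def pl(a, a1, a2):
    """Potential loss 2*log2(a) - log2(a1) - log2(a2) of a gap a split into a1, a2."""
    return 2 * math.log2(a) - math.log2(a1) - math.log2(a2)


def ratios_within_bounds(a, a1, a2):
    """Check that a/a1, a/a2, a1/a2 and a2/a1 all lie in [2^-pl, 2^pl]."""
    # A tiny multiplicative slack absorbs floating-point rounding.
    upper = 2 ** pl(a, a1, a2) * (1 + 1e-9)
    lower = 1 / upper
    ratios = [a / a1, a / a2, a1 / a2, a2 / a1]
    return all(lower <= r <= upper for r in ratios)


def random_check(trials=1000, seed=0):
    """Run the check on random gaps whose two sub-gaps fit inside the gap."""
    rng = random.Random(seed)
    for _ in range(trials):
        a1 = rng.randint(1, 1000)
        a2 = rng.randint(1, 1000)
        a = a1 + a2 + rng.randint(0, 1000)  # assumption: a1 <= a and a2 <= a
        if not ratios_within_bounds(a, a1, a2):
            return False
    return True


if __name__ == "__main__":
    print(random_check())
```

Such a check is no substitute for the proof, but it makes clear that the bounds only need the two sub-gaps to be at most as large as the gap they replace.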

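Lemma 19. Let $I(A, B) = \{A_1, B_1, A_2, B_2, \ldots\}$ be the set of segments with respect to operation Merge($A$, $B$). For any $i$ and $j$, where $1 < i < z$ and $1 < j < v$, let

\[ \alpha_i = \max\bigl( 2^{\mathrm{pl}(a_{i-2})}, 2^{\mathrm{pl}(b_{i-2})}, 2^{\mathrm{pl}(a_{i-1})}, 2^{\mathrm{pl}(b_{i-1})}, 2^{\mathrm{pl}(a_i)}, 2^{\mathrm{pl}(b_i)}, 2^{\mathrm{pl}(a_{i+1})} \bigr) \]

and

\[ \beta_j = \max\bigl( 2^{\mathrm{pl}(b_{j-2})}, 2^{\mathrm{pl}(a_{j-1})}, 2^{\mathrm{pl}(b_{j-1})}, 2^{\mathrm{pl}(a_j)}, 2^{\mathrm{pl}(b_j)}, 2^{\mathrm{pl}(a_{j+1})}, 2^{\mathrm{pl}(b_{j+1})} \bigr). \]

Then for $1 < i < z$ and $1 < j < v$, we have

\[ F(A_i) = O(\log \alpha_i) \quad \text{and} \quad F(B_j) = O(\log \beta_j). \]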

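Proof. We will present the proof of the first equality. The proof of the second one is analogous. Let $a''_{i-1} = b'_{i-1} = x$. Then, by Lemma 18 we have

\[ \frac{1}{2^{\mathrm{pl}(a_{i-1})}} \le \frac{a'_{i-1}}{a''_{i-1}} \le 2^{\mathrm{pl}(a_{i-1})} \quad \text{and} \quad \frac{1}{2^{\mathrm{pl}(b_{i-1})}} \le \frac{b''_{i-1}}{b'_{i-1}} \le 2^{\mathrm{pl}(b_{i-1})}, \]

which imply

\[ \frac{x}{\alpha_i} \le \frac{x}{2^{\mathrm{pl}(a_{i-1})}} \le a'_{i-1} \le x \cdot 2^{\mathrm{pl}(a_{i-1})} \le x\alpha_i \qquad (8) \]

and

\[ \frac{x}{\alpha_i} \le \frac{x}{2^{\mathrm{pl}(b_{i-1})}} \le b''_{i-1} \le x \cdot 2^{\mathrm{pl}(b_{i-1})} \le x\alpha_i. \qquad (9) \]

Note that $a_i' = b''_{i-1}$ and $b''_{i-2} = a'_{i-1}$. By Lemma 18 we have

\[ \frac{1}{2^{\mathrm{pl}(a_i)}} \le \frac{a_i''}{a_i'} \le 2^{\mathrm{pl}(a_i)} \quad \text{and} \quad \frac{1}{2^{\mathrm{pl}(b_{i-2})}} \le \frac{b'_{i-2}}{b''_{i-2}} \le 2^{\mathrm{pl}(b_{i-2})}, \]

which imply

\[ \frac{x}{\alpha_i^2} \le \frac{1}{2^{\mathrm{pl}(a_i)}} \cdot \frac{x}{\alpha_i} \le a_i'' \le 2^{\mathrm{pl}(a_i)} \cdot x\alpha_i \le x\alpha_i^2 \qquad (10) \]

and

\[ \frac{x}{\alpha_i^2} \le \frac{1}{2^{\mathrm{pl}(b_{i-2})}} \cdot \frac{x}{\alpha_i} \le b'_{i-2} \le 2^{\mathrm{pl}(b_{i-2})} \cdot x\alpha_i \le x\alpha_i^2. \qquad (11) \]

Note that $a''_{i-2} = b'_{i-2}$ and $b_i' = a_i''$. By Lemma 18 we have

\[ \frac{1}{2^{\mathrm{pl}(a_{i-2})}} \le \frac{a'_{i-2}}{a''_{i-2}} \le 2^{\mathrm{pl}(a_{i-2})} \quad \text{and} \quad \frac{1}{2^{\mathrm{pl}(b_i)}} \le \frac{b_i''}{b_i'} \le 2^{\mathrm{pl}(b_i)}, \]

which imply

\[ \frac{x}{\alpha_i^3} \le \frac{1}{2^{\mathrm{pl}(a_{i-2})}} \cdot \frac{x}{\alpha_i^2} \le a'_{i-2} \le 2^{\mathrm{pl}(a_{i-2})} \cdot x\alpha_i^2 \le x\alpha_i^3 \qquad (12) \]

and

\[ \frac{x}{\alpha_i^3} \le \frac{1}{2^{\mathrm{pl}(b_i)}} \cdot \frac{x}{\alpha_i^2} \le b_i'' \le 2^{\mathrm{pl}(b_i)} \cdot x\alpha_i^2 \le x\alpha_i^3. \qquad (13) \]

Note that $a'_{i+1} = b_i''$. By Lemma 18, we have

\[ \frac{1}{2^{\mathrm{pl}(a_{i+1})}} \le \frac{a''_{i+1}}{a'_{i+1}} \le 2^{\mathrm{pl}(a_{i+1})}, \]

which implies

\[ \frac{x}{\alpha_i^4} \le \frac{1}{2^{\mathrm{pl}(a_{i+1})}} \cdot \frac{x}{\alpha_i^3} \le a''_{i+1} \le 2^{\mathrm{pl}(a_{i+1})} \cdot x\alpha_i^3 \le x\alpha_i^4. \qquad (14) \]

Similarly, by Lemma 18, we have

\[ \frac{1}{2^{\mathrm{pl}(a_{i-1})}} \le \frac{a_{i-1}}{a''_{i-1}} \le 2^{\mathrm{pl}(a_{i-1})} \quad \text{and} \quad \frac{1}{2^{\mathrm{pl}(b_{i-1})}} \le \frac{b_{i-1}}{b'_{i-1}} \le 2^{\mathrm{pl}(b_{i-1})}, \]

which imply

\[ \frac{x}{\alpha_i} \le \frac{x}{2^{\mathrm{pl}(a_{i-1})}} \le a_{i-1} \le x \cdot 2^{\mathrm{pl}(a_{i-1})} \le x\alpha_i \qquad (15) \]

and

\[ \frac{x}{\alpha_i} \le \frac{x}{2^{\mathrm{pl}(b_{i-1})}} \le b_{i-1} \le x \cdot 2^{\mathrm{pl}(b_{i-1})} \le x\alpha_i. \qquad (16) \]

By Lemma 18, we have

\[ \frac{1}{2^{\mathrm{pl}(a_i)}} \le \frac{a_i}{a_i'} \le 2^{\mathrm{pl}(a_i)} \quad \text{and} \quad \frac{1}{2^{\mathrm{pl}(b_{i-2})}} \le \frac{b_{i-2}}{b''_{i-2}} \le 2^{\mathrm{pl}(b_{i-2})}, \]

which imply

\[ \frac{x}{\alpha_i^2} \le \frac{1}{2^{\mathrm{pl}(a_i)}} \cdot \frac{x}{\alpha_i} \le a_i \le 2^{\mathrm{pl}(a_i)} \cdot x\alpha_i \le x\alpha_i^2 \qquad (17) \]

and

\[ \frac{x}{\alpha_i^2} \le \frac{1}{2^{\mathrm{pl}(b_{i-2})}} \cdot \frac{x}{\alpha_i} \le b_{i-2} \le 2^{\mathrm{pl}(b_{i-2})} \cdot x\alpha_i \le x\alpha_i^2. \qquad (18) \]

By Lemma 18, we have

\[ \frac{1}{2^{\mathrm{pl}(a_{i-2})}} \le \frac{a_{i-2}}{a''_{i-2}} \le 2^{\mathrm{pl}(a_{i-2})} \quad \text{and} \quad \frac{1}{2^{\mathrm{pl}(b_i)}} \le \frac{b_i}{b_i'} \le 2^{\mathrm{pl}(b_i)}, \]

which imply

\[ \frac{x}{\alpha_i^3} \le \frac{1}{2^{\mathrm{pl}(a_{i-2})}} \cdot \frac{x}{\alpha_i^2} \le a_{i-2} \le 2^{\mathrm{pl}(a_{i-2})} \cdot x\alpha_i^2 \le x\alpha_i^3 \qquad (19) \]

and

\[ \frac{x}{\alpha_i^3} \le \frac{1}{2^{\mathrm{pl}(b_i)}} \cdot \frac{x}{\alpha_i^2} \le b_i \le 2^{\mathrm{pl}(b_i)} \cdot x\alpha_i^2 \le x\alpha_i^3. \qquad (20) \]

By Lemma 18, we have

\[ \frac{1}{2^{\mathrm{pl}(a_{i+1})}} \le \frac{a_{i+1}}{a'_{i+1}} \le 2^{\mathrm{pl}(a_{i+1})}, \]

which implies

\[ \frac{x}{\alpha_i^4} \le \frac{1}{2^{\mathrm{pl}(a_{i+1})}} \cdot \frac{x}{\alpha_i^3} \le a_{i+1} \le 2^{\mathrm{pl}(a_{i+1})} \cdot x\alpha_i^3 \le x\alpha_i^4. \qquad (21) \]

We proceed as follows. For $1 < i < z$, we have

\[ F(A_i) = \log \frac{w(\max(A_{i-1})) + w(\min(A_{i+1})) + \sum_{x \in A_i} w(x)}{\min\bigl(w'(\max(B_{i-1})),\, w'(\min(A_i)),\, w'(\max(A_i)),\, w'(\min(B_i))\bigr)} \quad \text{by definition of } F(\cdot) \]

\[ = \log \frac{a_{i-2} + a_{i-1} + w(\min(A_{i+1})) + \sum_{x \in A_i} w(x)}{\min\bigl(w'(\max(B_{i-1})),\, w'(\min(A_i)),\, w'(\max(A_i)),\, w'(\min(B_i))\bigr)} \quad \text{by definition of } w(\cdot) \]

\[ \le \log \frac{a_{i-2} + a_{i-1} + a_i + a_{i+1} + \sum_{x \in A_i} w(x)}{\min\bigl(w'(\max(B_{i-1})),\, w'(\min(A_i)),\, w'(\max(A_i)),\, w'(\min(B_i))\bigr)} \quad \text{by definition of } w(\cdot) \]

\[ = \log \frac{a_{i-2} + a_{i-1} + a_i + a_{i+1} + 2 b_{i-1}}{\min\bigl(w'(\max(B_{i-1})),\, w'(\min(A_i)),\, w'(\max(A_i)),\, w'(\min(B_i))\bigr)} \quad \text{by definition of } W(\cdot) \]

\[ \le \log \frac{a_{i-2} + a_{i-1} + a_i + a_{i+1} + 2 b_{i-1}}{\min\bigl(b'_{i-1},\, a''_{i-1},\, a_i',\, b''_{i-1}\bigr)} \quad \text{by definition of } w'(\cdot) \]

\[ \le \log \frac{x\alpha_i^3 + x\alpha_i + x\alpha_i^2 + x\alpha_i^4 + 2x\alpha_i}{x/\alpha_i} \quad \text{by the bounds (8)-(21)} \]

\[ = O(\log \alpha_i). \]

The proof of the second equality, $F(B_j) = O(\log \beta_j)$ for $1 < j < v$, is analogous.

□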

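Lemma 20. For $1 < i < z$ and $1 < j < v$, we have

\[ 7 \cdot \Bigl( \sum_i \log 2^{\mathrm{pl}(a_i)} + \sum_j \log 2^{\mathrm{pl}(b_j)} \Bigr) \ge \Bigl( \sum_i \log \alpha_i + \sum_j \log \beta_j \Bigr). \]

Proof. Observe that a gap $a_k$ can be mapped to at most seven times by unique $\alpha_i$'s and $\beta_j$'s; namely only by $\alpha_{k-1}, \beta_{k-1}, \alpha_k, \beta_k, \alpha_{k+1}, \beta_{k+1}, \alpha_{k+2}$. Similarly, a gap $b_k$ can be mapped to at most seven times by unique $\alpha_i$'s and $\beta_j$'s; namely only by $\beta_{k-1}, \alpha_k, \beta_k, \alpha_{k+1}, \beta_{k+1}, \alpha_{k+2}, \beta_{k+2}$. The lemma follows. □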
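The counting argument can be checked mechanically. The Python sketch below assumes that each alpha_i is a maximum over the seven gaps a_{i-2}, b_{i-2}, a_{i-1}, b_{i-1}, a_i, b_i, a_{i+1} and each beta_j over b_{j-2}, a_{j-1}, b_{j-1}, a_j, b_j, a_{j+1}, b_{j+1}, as in Lemma 19, and counts how many of these windows contain any one gap. The function names are hypothetical, and the index ranges 1 < i < z and 1 < j < v are taken as strict.

```python
from collections import Counter


def alpha_window(i):
    """Gaps whose potential loss appears in alpha_i, as (kind, index) pairs."""
    return [("a", i - 2), ("b", i - 2), ("a", i - 1), ("b", i - 1),
            ("a", i), ("b", i), ("a", i + 1)]


def beta_window(j):
    """Gaps whose potential loss appears in beta_j, as (kind, index) pairs."""
    return [("b", j - 2), ("a", j - 1), ("b", j - 1), ("a", j),
            ("b", j), ("a", j + 1), ("b", j + 1)]


def max_multiplicity(z, v):
    """Largest number of alpha/beta windows that contain the same gap."""
    counts = Counter()
    for i in range(2, z):  # 1 < i < z
        counts.update(alpha_window(i))
    for j in range(2, v):  # 1 < j < v
        counts.update(beta_window(j))
    return max(counts.values()) if counts else 0


if __name__ == "__main__":
    print(max_multiplicity(20, 20))
```

Since no gap is counted more than seven times, summing log alpha_i and log beta_j over all segments charges each potential loss at most seven times, which is where the factor 7 in the lemma comes from.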

We are now ready to bound the amortized time complexity of the Merge operation. 

Theorem 21. The Merge($A$, $B$) operation has an amortized time complexity of $O(\log n)$.



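Proof. We will analyze the Merge operation using the potential method [14]. Recall that $D_i$ represents the data structure after operation $i$, where $D_0$ is the initial data structure. The amortized cost of operation $i$ is $\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$. Then the amortized cost of the Merge($A$, $B$) operation is

\[ \hat{c}_i = c_i + \Delta\Phi \]

\[ = O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr) + \Delta\Phi \quad \text{by Theorem 16} \]

\[ = O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr) - \kappa_d \cdot \Bigl( \sum_{i=0}^{z} \mathrm{pl}(a_i) + \sum_{j=0}^{v} \mathrm{pl}(b_j) \Bigr) \quad \kappa_d \text{ comes from } \Phi \]

\[ = O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr) - \kappa_d \cdot \Bigl( \sum_{i=2}^{z-1} \mathrm{pl}(a_i) + \sum_{j=2}^{v-1} \mathrm{pl}(b_j) \Bigr) \quad \text{absorbed by } O(\log n) \]

\[ \le O\Bigl(\log n + \sum_{i=2}^{z-1} F(A_i) + \sum_{j=2}^{v-1} F(B_j)\Bigr) - \frac{\kappa_d}{7} \cdot \Bigl( \sum_{i=2}^{z-1} \log \alpha_i + \sum_{j=2}^{v-1} \log \beta_j \Bigr) \quad \text{by Lemma 20} \]

\[ \le O(\log n) + \kappa_f \cdot \Bigl( \sum_{i=2}^{z-1} \log \alpha_i + \sum_{j=2}^{v-1} \log \beta_j \Bigr) - \frac{\kappa_d}{7} \cdot \Bigl( \sum_{i=2}^{z-1} \log \alpha_i + \sum_{j=2}^{v-1} \log \beta_j \Bigr) \quad \text{by Lemma 19} \]

\[ \le O(\log n). \quad \text{Set } \kappa_d = 7\kappa_f. \]

□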



We can now state our main theorem. 



Theorem 22. The Mergeable Dictionary executes a sequence of $m$ Find, Search, Split, and Merge operations in worst-case $O(m \log n)$ time.

Proof. Follows directly from Theorem 9 and Theorem 21. □